
A Rohde & Schwarz Company

Benchmarking of VoLTE Services

A first field experience

March 2015

SwissQual AGAllmendweg 8 CH-4528 Zuchwil Switzerland

t +41 32 686 65 65 f +41 32 686 65 66 e [email protected] www.swissqual.com

Part Number: 12-070-200912-4


SwissQual has made every effort to ensure that any instructions contained in this document are adequate and free of errors and omissions. SwissQual will, if necessary, explain issues which may not be covered by the documents. SwissQual’s liability for any errors in the documents is limited to the correction of errors and the aforementioned advisory services.

Copyright 2000 - 2015 SwissQual AG. All rights reserved.

No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or translated into any human or computer language without the prior written permission of SwissQual AG.

Confidential materials.

All information in this document is regarded as commercially valuable, protected and privileged intellectual property, and is provided under the terms of existing Non-Disclosure Agreements or as commercial-in-confidence material.

When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo trademark somewhere in your text.

SwissQual®, Seven.Five®, SQuad®, QualiPoc®, NetQual®, VQuad®, Diversity® as well as the following logos are registered trademarks of SwissQual AG.

Diversity Explorer™, Diversity Ranger™, Diversity Unattended™, NiNA+™, NiNA™, NQAgent™, NQComm™, NQDI™, NQTM™, NQView™, NQWeb™, QPControl™, QPView™, QualiPoc Freerider™, QualiPoc iQ™, QualiPoc Mobile™, QualiPoc Static™, QualiWatch-M™, QualiWatch-S™, SystemInspector™, TestManager™, VMon™, VQuad-HD™ are trademarks of SwissQual AG.

SwissQual acknowledges the following trademarks for company names and products:

Adobe®, Adobe Acrobat®, and Adobe Postscript® are trademarks of Adobe Systems Incorporated.

Apple is a trademark of Apple Computer, Inc.

DIMENSION®, LATITUDE®, and OPTIPLEX® are registered trademarks of Dell Inc.

ELEKTROBIT® is a registered trademark of Elektrobit Group Plc.

Google® is a registered trademark of Google Inc.

Intel®, Intel Itanium®, Intel Pentium®, and Intel Xeon™ are trademarks or registered trademarks of Intel Corporation.

INTERNET EXPLORER®, SMARTPHONE®, TABLET® are registered trademarks of Microsoft Corporation.

Java™ is a U.S. trademark of Sun Microsystems, Inc.

Linux® is a registered trademark of Linus Torvalds.

Microsoft®, Microsoft Windows®, Microsoft Windows NT®, and Windows Vista® are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

NOKIA® is a registered trademark of Nokia Corporation.

Oracle® is a registered US trademark of Oracle Corporation, Redwood City, California.

SAMSUNG® is a registered trademark of Samsung Corporation.

SIERRA WIRELESS® is a registered trademark of Sierra Wireless, Inc.

TRIMBLE® is a registered trademark of Trimble Navigation Limited.

U-BLOX® is a registered trademark of u-blox Holding AG.

UNIX® is a registered trademark of The Open Group.


Contents

1 VoLTE as a new voice service

2 VoLTE - Technically very simple

3 Testing and Benchmarking VoLTE

VoLTE Call Setup

VoLTE Audio Transmission

Audio quality in VoLTE

Audio quality and transcoding

Audio delay and conversation quality

Absolute mean audio delay

Variable audio delay

Measuring and evaluation of variable audio delay

4 Conclusion


Figures

Figure 1: Basic flow in a VoLTE to VoLTE connection

Figure 2: Example of SIP message flow in a VoLTE to VoLTE call

Figure 3: Call Set-Up times for mobile to PSTN connections

Figure 4: Call Set-Up times for mobile to mobile connections

Figure 5: Voice codec information in VoLTE RTP header

Figure 6: IP Throughput in a VoLTE call

Figure 7: Examples for listening quality for VoLTE in comparison to 3G calls

Figure 8: Examples of quality distribution for VoLTE and 3G calls

Figure 9: Examples of audio delay in mobile to PSTN calls

Figure 10: Examples of audio delay in VoLTE and 3G mobile to mobile calls

Figure 11: Basic principle of a jitter buffer to compensate packet delay jitter

Figure 12: Example of an aligned pair of reference and degraded signal

Figure 13: Example of variable delay in a VoLTE live test sample

Figure 14: Example of variable delay in a 3G live test sample

Figure 15: Occurrence of delay changes for VoLTE and 3G live network samples

Figure 16: P.863 MOS-LQO statistics in relation to delay changes in the test sample


Foreword

In this white paper we take an in-depth look at the performance of recently launched commercial VoLTE networks by evaluating speech quality. The information presented is based on real-world data collected from an end-to-end perspective, simulating the end-user experience. The data was collected between June 2014 and January 2015 on three different VoLTE operators in the US. This white paper does not cover core network architecture or call handling within the network. It covers the attributes of a VoLTE network and how they affect voice quality, and it describes the test methodology in detail.

This document will provide first results and methodologies of testing and benchmarking VoLTE. It treats VoLTE from an end-to-end measurement perspective. The focus of the document revolves around the typical measurements made in all voice telecommunications networks.

1 VoLTE as a new voice service

LTE networks first launched commercially in 2010. Since then, the deployment of LTE networks has accelerated steadily around the globe. It is fair to say the technology has proven very successful in the real world, given the ever-increasing demand for data by mobile device users. LTE networks offer well-documented bandwidth advantages over the legacy 3G networks and their derivative data services (UMTS, HSPA, EVDO), even though the bandwidth of these 3G-based networks has improved significantly. In principle, LTE is more a concept than a fixed technology. It opens the door to a stepwise evolution in capacity and speed. LTE is well on its way to becoming the basic transmission technology for all networks around the world, providing an excellent way forward for the GSM- and CDMA-based networks currently deployed.

Since LTE is a data-only technology (no circuit-switched voice capability), voice telephony to date has been handled by circuit-switched fallback to either a CDMA- or GSM-based voice service. This adds complexity to the device being used. For instance, on GSM-based carriers the voice call requires a circuit-switched connection (2G/3G); a parallel data connection cannot stay in LTE, so it must fall back to UMTS or HSPA as well. In CDMA networks, the device must have a CDMA-based voice call and a second radio chipset for LTE if simultaneous voice and data are enabled. For these reasons, along with the inherent spectral efficiency of LTE, voice calls over LTE are a natural next step.

A dedicated voice call service in LTE, known as Voice over LTE (VoLTE), is a high priority for many LTE operators globally. By late 2014, several LTE operators had launched VoLTE services, albeit in a conservative manner. Although VoLTE should not be confused with VoIP (voice over internet protocol), there is a desire by the industry for the public to associate these two. VoLTE signalling, call setup and voice transmission are very close to common SIP-based VoIP services. The most important difference is the deep integration of VoLTE as a service into the mobile core network.

Typically, VoIP services are installed over the top (OTT) of a data connection, usually as an application on a smartphone. Such a service runs as a normal data service, uses the default data bearer and has to share it with other active services. Usually, there is no integration into the phone’s call client. Without that integration, features such as call waiting, three-way calling and call forwarding are not available. In addition, if you are on an OTT VoIP client, there is a good chance your call will drop if a “real” voice call comes in.

Network operators stand to benefit substantially by deploying VoLTE services. Primarily, operators will gain spectrum by pushing voice calls to VoLTE. The more voice traffic is moved to LTE, the more 3G spectrum operators can re-farm for LTE. The end game will be to eliminate all 2G/3G networks in favor of a single LTE network. Subscribers stand to benefit as well. Carriers will be able to offer HD Voice, video chat, fast call setup times, and better battery life, just to name a few. For all its promise, network operators will need to proceed with caution. Nothing will be more catastrophic than poor performance on voice calls. Even though most subscribers are gobbling up data at unprecedented rates, their voice experience will define the overall quality of a carrier.


2 VoLTE - Technically very simple

From a technical perspective, a VoLTE call is quite straightforward. The first step is for the UE to establish a data connection on the LTE network. If the UE is VoLTE capable, it will then attempt to register on the IMS (IP Multimedia Subsystem) server. The IMS server is unique to each carrier and not accessible from the public domain, unlike a typical VoIP server, which is reachable from the public internet.

Once the UE is registered on the IMS Server, the server will manage all voice connections. The communication protocol between the IMS Server and the UE is SIP (Session Initiation Protocol) and the underlying protocol is TCP. All of these transactions will happen in the background, often not visible to the user. Some UEs will have an icon that shows up in the display indicating the device is properly registered on the IMS Server.

In the case of a call request, the call setup is managed by the IMS server and either linked to another LTE device (as shown below), to a circuit switched mobile, or to a fixed line subscriber.

Figure 1: Basic flow in a VoLTE to VoLTE connection

The voice connection itself uses RTP over a UDP connection between the two mobile phones. For calls from a VoLTE device to a circuit-switched technology, there is a gateway that the IMS Server communicates with. The voice codecs utilized are the same as in today's mobile networks, such as AMR and AMR-WB. Instead of being framed for a circuit-switched channel, the voice stream is packetized. In the near future a new voice codec, EVS (Enhanced Voice Services), will be available.

The applied audio pre-processing is also the same for legacy calls and VoLTE. Gain control, noise suppression and DTX are applied the same way.

3 Testing and Benchmarking VoLTE

From the user's perspective nothing is visible about the technology used for a voice call. The user simply dials a number and presses the send button. He or she doesn’t care whether the call is made in 2G/3G or in a VoLTE environment. All the user really cares about is whether the call goes through, the voice quality, and call retainability (in our industry we refer to these KPIs as blocks, MOS, and drops). The same should be true from the end-to-end testing perspective. When VoLTE is tested as a service, it must be benchmarked using the same KPIs and statistics as a normal voice service.

Operators are interested in the call setup time, audio transmission time, handovers, radio conditions, and much more.
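As an illustration of how the three user-facing KPIs (blocks, drops, MOS) can be aggregated from a test campaign, the following Python sketch summarizes a list of call records. The `CallRecord` structure, the field names and the sample values are hypothetical and chosen only for this example; they do not describe the data format of any particular test tool.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class CallRecord:
    """Hypothetical per-call result of a drive-test campaign."""
    setup_ok: bool            # False means the call was blocked
    dropped: bool             # True means an established call ended unexpectedly
    mos_samples: List[float]  # listening-quality scores measured during the call


def summarize(calls: List[CallRecord]) -> Dict[str, Optional[float]]:
    """Aggregate block rate, drop rate and mean MOS over all calls."""
    attempts = len(calls)
    established = [c for c in calls if c.setup_ok]
    blocked = attempts - len(established)
    dropped = sum(1 for c in established if c.dropped)
    scores = [s for c in established for s in c.mos_samples]
    return {
        "block_rate": blocked / attempts if attempts else None,
        "drop_rate": dropped / len(established) if established else None,
        "mean_mos": mean(scores) if scores else None,
    }


# Invented sample data, for illustration only.
calls = [
    CallRecord(True, False, [3.9, 3.7]),
    CallRecord(True, True, [3.1]),
    CallRecord(False, False, []),
    CallRecord(True, False, [4.0, 3.8]),
]
kpis = summarize(calls)
print(kpis)
```

Because the same function can be applied to VoLTE and to legacy 2G/3G calls, the resulting figures stay directly comparable, which is the point made above.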


In general, the meaning of KPIs and metrics is the same for VoLTE as for legacy technologies. This is required if operators have to benchmark a VoLTE service against a legacy voice call in 2G/3G under comparable conditions. The technical details that enable a VoLTE call are fairly straightforward as long as we evaluate a VoLTE to VoLTE call within the same IMS Server. Beyond this simple case, however, complexities quickly arise. The following call types need special attention to ensure KPIs are not compromised.

- Subscriber is in LTE, having VoLTE service but calling a 2G/3G/PSTN subscriber

- Subscriber is in LTE, having an established VoLTE call but losing LTE coverage (SRVCC, Single Radio Voice Call Continuity)

- Subscriber is in LTE, having no VoLTE service, just LTE data access (CSFB, Circuit-Switched Fall-Back)

The following sections provide some real world examples for analysis. These measurement results were taken from live VoLTE networks using commercially available devices. These numbers illustrate the important parameters associated with running a quality VoLTE network. They can be used for setting a benchmark upon which to improve, or to evaluate one carrier vs. another.

VoLTE Call Setup

Call setup time is one of the most important KPIs when measuring voice quality. From a user perspective, this is the time that elapses between pressing the send button and the moment the call is connected and the audio channel is opened. These types of call events can be measured in a VoLTE environment by looking into SIP messaging. Once the VoLTE device is registered in the IMS server, the call setup and handling is taken care of by SIP messaging. The SIP connection itself is usually encrypted, but this depends on the operator's individual VoLTE settings. As shown in Figure 2, SIP provides the main states of the call flow and the required trigger points to calculate the main KPIs, Call Setup Time and Post Dial Delay. The list on the left shows the SIP messages from the calling phone and the list on the right shows the SIP messages from the receiving phone.

Figure 2: Example of SIP message flow in a VoLTE to VoLTE call
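The calculation of both KPIs from SIP trigger points can be sketched in a few lines of Python. The trigger points used here are an assumption, since the paper does not define them: Post Dial Delay is taken as the time from the outgoing INVITE to the first 180 Ringing, and Call Setup Time as the time from the INVITE to the 200 OK that answers it. The log format and the timestamps are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

# One entry per SIP message seen at the calling UE:
# (timestamp in seconds, method or status code, CSeq method).
SipEvent = Tuple[float, str, str]


def first_time(log: List[SipEvent], what: str, cseq: str) -> Optional[float]:
    """Timestamp of the first message matching method/status and CSeq method."""
    for ts, msg, cseq_method in log:
        if msg == what and cseq_method == cseq:
            return ts
    return None


def setup_kpis(log: List[SipEvent]) -> Dict[str, Optional[float]]:
    """Derive both KPIs; trigger points are assumptions (see lead-in)."""
    t_invite = first_time(log, "INVITE", "INVITE")
    t_ringing = first_time(log, "180", "INVITE")
    # Match on the CSeq method so that a 200 OK for PRACK is not mistaken
    # for the answer to the INVITE.
    t_answer = first_time(log, "200", "INVITE")

    def since_invite(t: Optional[float]) -> Optional[float]:
        if t_invite is None or t is None:
            return None
        return t - t_invite

    return {
        "post_dial_delay_s": since_invite(t_ringing),
        "call_setup_time_s": since_invite(t_answer),
    }


# Invented example log, for illustration only.
log = [
    (0.00, "INVITE", "INVITE"),
    (0.05, "100", "INVITE"),
    (1.20, "183", "INVITE"),
    (1.25, "PRACK", "PRACK"),
    (1.40, "200", "PRACK"),
    (2.10, "180", "INVITE"),
    (4.60, "200", "INVITE"),
    (4.65, "ACK", "ACK"),
]
result = setup_kpis(log)
print(result)
```

Whatever trigger points a campaign settles on, they should be applied identically to every carrier and call type being compared.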

It’s important for operators to distinguish between various call types when analyzing call setup time metrics. For mobile to mobile calls, VoLTE to VoLTE call results should be separated from VoLTE to 2G/3G calls. These two types of voice calls make for an interesting comparison, particularly when compared to a traditional 3G to 3G call. In addition to analyzing the mobile to mobile differences, it is important to analyze the differences between 2G/3G calls to a PSTN and VoLTE calls to a PSTN. As mentioned above, carriers benefit from comparing these different call types to ensure that newer services such as VoLTE do not give customers a worse experience than the one they are used to. In addition, benchmarking one carrier against another can provide competitive feedback on areas where improvement is needed to be better than the competition.

An example of a drive test campaign conducted in early 2015, running a few hundred calls, shows a visible difference between the categories. The test was conducted across three carriers (A, B, and C), all with VoLTE services enabled.


Figure 3: Call Set-Up times for mobile to PSTN connections

The measured Call Set-Up Times are almost statistically similar across all carriers for the 3G to VoLTE call scenario. It appears there is a slight advantage in VoLTE to PSTN calls, particularly on Carrier A, but the amount of data does not allow a statistical significance analysis for confirmation.

Given that mobile phone to mobile phone traffic is on the rise, it is important to evaluate the results of the various call scenarios that exist. The Call Set-Up Times in mobile to mobile connections can be subdivided into three categories: legacy 3G to 3G connections, VoLTE to VoLTE connections, and cross-technology calls from CS to VoLTE.

Figure 4: Call Set-Up times for mobile to mobile connections

Analysing these cases shows a clear advantage for VoLTE. Even calling a VoLTE client from 3G shows a shorter Call Set-Up Time than a common 3G to 3G call, and a pure VoLTE to VoLTE call set-up takes less than 60% of the time of a 3G to 3G call in this real-world data.

Please consider this as an example: the data were gained in just three arbitrarily chosen networks, in one arbitrarily chosen region, in a two-day test drive.¹ However, the Call Setup Time seems to benefit significantly from a move to VoLTE support.

¹ It has to be noted that LTE-capable devices are in LTE in idle mode and fall back to 3G/2G in the event of a voice call. This strategy, called CSFB, extends the call setup in 3G/2G by around 1s compared to phones that have no LTE support at all and where CSFB is not applied.


VoLTE Audio Transmission

Technically speaking, the audio processing in VoLTE is not different from legacy circuit-switched calls. The speech codecs are the same, as are frame sizes and loss concealment strategies. Even though VoLTE supports a wide range of speech codecs, today mainly AMR-WB is deployed. In the near future the EVS codec may be integrated. AMR-WB offers 7 kHz audio bandwidth, whereas EVS will offer up to 20 kHz audio bandwidth. For the end user, this will lead to superior audio quality over traditional telephony.

Figure 5: Voice codec information in VoLTE RTP header

In contrast to a circuit-switched call, the voice transmission is visible at the IP layer. The following graph illustrates the bit-streams in uplink and downlink directions.

Figure 6: IP Throughput in a VoLTE call

The visible pulse-like pattern is caused by the half-duplex transmission of the voice samples, which are sent alternately in each direction. Each sample consists of two sentences, and even this speech-pause-speech pattern can be recognized in the IP throughput.
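The order of magnitude of the IP throughput during active speech can be estimated with a short calculation. The Python sketch below rests on assumptions that are not taken from the measurements in this paper: AMR-WB at 12.65 kbps with 20 ms frames, one frame per RTP packet in the bandwidth-efficient payload format of RFC 4867 (4-bit CMR plus 6-bit table-of-contents entry), plain RTP/UDP/IP headers without header compression, and no IPsec overhead. Pauses with DTX are ignored.

```python
# Rough per-direction IP throughput of an AMR-WB stream during active speech.
# All parameters are assumptions for illustration (see the text above).

CODEC_RATE_BPS = 12_650      # AMR-WB mode 12.65 kbps
FRAMES_PER_SECOND = 50       # one 20 ms frame per RTP packet

# Speech bits per frame: 12650 bit/s / 50 frames/s = 253 bits.
speech_bits = CODEC_RATE_BPS // FRAMES_PER_SECOND

# Bandwidth-efficient RTP payload header (RFC 4867): 4-bit CMR + 6-bit ToC entry.
PAYLOAD_HEADER_BITS = 4 + 6

# The payload is padded to whole bytes: ceil(263 / 8) = 33 bytes.
payload_bytes = -(-(speech_bits + PAYLOAD_HEADER_BITS) // 8)

RTP_HEADER = 12   # bytes, fixed RTP header without extensions
UDP_HEADER = 8    # bytes
IPV4_HEADER = 20  # bytes, no options
IPV6_HEADER = 40  # bytes, no extension headers


def ip_rate_kbps(ip_header_bytes: int) -> float:
    """IP-layer bit rate in kbps for one direction while speech is active."""
    packet_bytes = payload_bytes + RTP_HEADER + UDP_HEADER + ip_header_bytes
    return packet_bytes * 8 * FRAMES_PER_SECOND / 1000


rate_v4 = ip_rate_kbps(IPV4_HEADER)
rate_v6 = ip_rate_kbps(IPV6_HEADER)
print(f"payload per packet: {payload_bytes} bytes")
print(f"IPv4: {rate_v4:.1f} kbps, IPv6: {rate_v6:.1f} kbps")
```

Under these assumptions the protocol headers are larger than the speech payload itself, so the rate seen at the IP layer is well above the nominal codec rate. Header compression on the radio link, where used, reduces this overhead on the air interface.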


Audio quality in VoLTE

VoLTE is a new service; in practice it is based on the use of AMR-WB. Even though other, older codecs are supported technically, no commercially released devices have to fall back to these older codecs. This makes VoLTE different from legacy CS voice services, where AMR narrowband is typically the common ground today.

To begin with, a few typical results of drive-test campaigns are shown for illustration. In each scenario several hundred calls were processed and more than 1000 single test samples were averaged to get the mean score. The data collection was done in summer 2014, mainly in U.S. networks.

The following graph (Figure 7) shows results collected in early drive test campaigns comparing VoLTE to traditional CS mobile to mobile calls. The figure shows the MOS derived by ITU-T P.863 ‘POLQA’ in super-wideband mode in different scenarios.

The leftmost column (3G-3G NB) shows the results for legacy mobile to mobile calls with transcoding. The MOS is lower because there are two coding steps and the transmission is based on AMR (NB), which has a narrower audio bandwidth than AMR-WB.

In the next column (3G-3G TrFO) the average results for AMR-WB transcoding-free connections (TrFO) are shown. The average score is, of course, higher than for the red column, even though the RF conditions might be the same. This is due to the reduction to just one compression step and, more importantly, to the higher audio bandwidth offered by the AMR-WB speech codec.

Of course, a wideband channel can be benchmarked against a narrowband channel, but in this case narrowband will be scored lower just because of the band limitation, even if the RF network conditions are equally good or better. It is important to note that ITU-T P.863, when set to super-wideband mode, will rate a good-quality traditional mobile phone call with a comparatively low score such as 3.23, because the expectation is a wideband audio call. From a benchmarking perspective, wideband audio will sound much better than narrowband audio, hence the significantly better MOS values for wideband calls.

The resulting question is:

When comparing a VoLTE to VoLTE connection directly to a circuit-switched channel, what is the corresponding counterpart in 2G/3G technology?

When benchmarking VoLTE to VoLTE calls, it is fairer to compare them to such transcoding-free AMR-WB connections. The two rightmost columns are based on VoLTE to VoLTE connections. Both campaigns were run on the same driving route in the same network, just with different VoLTE devices.

Figure 7: Examples for listening quality for VoLTE in comparison to 3G calls

It is visible that a VoLTE to VoLTE connection is in the same quality range as a 3G mobile to mobile connection if TrFO is used. This seems logical since the coding technology is exactly the same. What accounts for the small difference (3.74 vs 3.66) in voice quality that the VoLTE-based calls have over the 3G calls if the coding is the same?

The explanation can be found in the distribution of the MOS values shown in Figure 8. For both, the maximum quality is determined by the compression technology used, which is AMR-WB at 12.65 kbps. The maximum value recorded for AMR-WB falls in the MOS range of 4.0 to 4.2. Note that there is no technology-dependent disadvantage with either VoLTE or 3G.

Figure 8: Examples of quality distribution for VoLTE and 3G calls

However, the slope towards lower MOS values is steeper for VoLTE-VoLTE calls in these campaigns. Essentially, slightly distorted voice clips (MOS values in the range of 3.2 to 3.8) are less common with VoLTE-based calls than with 3G-based calls. Considerably distorted clips below MOS 3.2 occur about equally often in both cases, and they are quite rare.

What could be the reason for that? The typical cause of small impacts on quality is frame replacement by the decoder. In 3G this is usually due to erroneous frames that are discarded and replaced. For VoLTE the same applies, and in addition delayed packets are discarded too. Overall, those problems occur less often in the tested LTE connections. This can be partially caused by the low load in the live LTE network; these numbers are just examples and may change when VoLTE is fully used as a service.

Audio quality and transcoding

In a pure VoLTE to VoLTE connection the voice stream is compressed once, transmitted and decoded with AMR-WB, typically at 12.65 kbps. The same is true for a 3G TrFO connection. However, today transcoding may still happen when the VoLTE packet stream is converted into a legacy technology, usually into AMR 12.2 kbps or into G.711 A-law/µ-law.

This transcoding not only inserts an additional coding step, it also reduces the audio bandwidth from 7 kHz wideband to traditional 3.1 kHz narrowband telephony. Since VoLTE subscribers are a minority today, they will experience wideband quality only when they call another VoLTE subscriber in the same network. When calling a mobile subscriber in 3G or 2G they will lose wideband and suffer a further disadvantage from the additional transcoding to AMR narrowband.

VoLTE technology has just started to be deployed; we will recognize improvements and better interoperability over the next months and years. Transcoding free IP transmission will ultimately become standard in core networks and may even serve IP-capable fixed network devices. As evidence of this rapid evolution, we see one operator transferring AMR-WB calls from 3G to VoLTE without losing wideband audio (based on real world data collected in January 2015).


Audio delay and conversation quality

A fixed audio delay, understood as the physical transmission time of the voice signal, is not directly related to listening audio quality. In a pure listening situation a fixed delay is not distorting, not even perceptible. A long audio delay gets annoying in a conversation, where fluent interaction becomes more and more difficult. Rather than referring to this as audio quality, we refer to this element of voice calls as “conversation quality.” Conversation quality deteriorates as audio delays increase.

In Public Switched Telephone Networks (PSTN), the audio delay has traditionally been less than 100ms and was not considered detrimental to typical conversation quality. Since the 1970s, interconnected global telecommunications systems have expanded rapidly, primarily as a byproduct of a more global economy and the “shrinking world.” With this came longer-distance connections and especially satellite links, which contributed to longer audio delays in speech communications. Audio delays of several hundred milliseconds to more than one second were not uncommon. Today, even though satellite links are not used anymore, open VoIP connections in OTT applications have audio delays that are annoyingly large.

Compared to pure PSTN connections, mobile networks experience longer audio delays. With the introduction of 2G and 3G networks, audio delay grew into a moderate problem again. Due to framing, interleaving and the air link itself, a typical mobile to PSTN connection easily exceeds 150ms in delay. For a mobile to mobile connection, audio delays can be considerably longer. Depending on the network setup, one-way audio delays may hit the 400ms threshold.

However, in circuit-switched connections including 2G and 3G mobile channels, there is a fixed delay but no further variation. All voice frames are received after a certain transmission time, but this time is essentially the same for all voice frames. This means that there is no delay variation when transmitting the speech.²

For VoLTE the framing and queuing is different. Audio is transmitted in a packet stream and is affected by the variations in transmission time that are typical for packet-switched connections. To deal with these variations (jitter) when receiving the packets, VoLTE receivers (like any VoIP receiver) make use of a so-called jitter buffer. Received packets are placed in this queue before being decoded. This allows the decoder to take packets out of the queue at the correct time. Since there are always some packets buffered, the VoLTE client can assemble them properly for fluid playback of audio to the subscriber.

As usual, there is a trade-off. On one hand, a large jitter buffer can deal well with highly varying packet delay. On the other hand, a large jitter buffer adds to the audio delay and thereby compromises conversation quality. Conversely, if a very short jitter buffer is deployed (preserving conversation quality), the buffer has a higher probability of being emptied, referred to as a buffer underrun. A buffer underrun causes a gap in audio decoding: the buffer is empty, no packet is available, and the decoder cannot decode and deliver voice. In the simplest case there is just silence, perceived as a gap, which impacts audio quality. Fortunately, VoLTE devices have sophisticated strategies that help maintain audio quality even when jitter buffers are close to underrunning. For instance, if a jitter buffer is running low, the client can attempt to place the gaps caused by a buffer underrun at a point in the audio where there is silence. If there is no silent area in which to place the gaps, the client can repeat packets. This preserves audio quality but results in received audio that is “stretched” to fill the blanks caused by buffer underruns. It is important to understand that the mean delay of the voice itself has no influence on listening quality, only on conversation quality. Uncompensated delay jitter, however, may lead to actions by the decoder that cause perceptible distortions in audio quality. Carriers implementing VoLTE have to find the balance between audio quality and conversation quality (audio delay). For this reason, pure MOS measurements must be supplemented by audio delay measurements to get the complete “conversation quality” picture.
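The trade-off between buffer depth and underruns can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions, not a model of any real VoLTE client: the buffer depth is fixed rather than adaptive, jitter is drawn from a uniform distribution, and the names `simulate_playout`, `make_arrivals` and all numeric values are hypothetical choices for illustration.

```python
import random

FRAME_MS = 20  # one voice frame per packet (20ms frames, as for AMR)


def simulate_playout(arrival_ms, buffer_ms):
    """Count frames that arrive after their playout slot for a fixed buffer depth.

    Frame i is scheduled for playout at first_arrival + buffer_ms + i * FRAME_MS.
    A frame arriving later than its slot means the decoder had nothing to play
    at that moment (an underrun in this simplified model).
    """
    playout_start = arrival_ms[0] + buffer_ms
    late = 0
    for i, arrival in enumerate(arrival_ms):
        deadline = playout_start + i * FRAME_MS
        if arrival > deadline:
            late += 1
    return late


def make_arrivals(n_frames, base_delay_ms, max_jitter_ms, seed=1):
    """Synthetic arrival times: fixed network delay plus uniform random jitter."""
    rng = random.Random(seed)
    return [i * FRAME_MS + base_delay_ms + rng.uniform(0, max_jitter_ms)
            for i in range(n_frames)]


# Assumed scenario: 50ms fixed delay, up to 60ms of additional jitter.
arrivals = make_arrivals(n_frames=500, base_delay_ms=50, max_jitter_ms=60)
results = {depth: simulate_playout(arrivals, depth) for depth in (0, 20, 40, 60, 80)}
for depth, late in results.items():
    print(f"buffer {depth:3d} ms -> {late:3d} late frames of {len(arrivals)}")
```

In this sketch every additional millisecond of buffer depth is added directly to the mouth-to-ear delay, while the number of late frames can only go down as the depth grows. Once the depth covers the full jitter range, no frame is late, but the conversation pays for it with the extra delay.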

2 There are rare cases where, during a handover, one frame gets lost or duplicated, but this results in a delay variation of only one voice frame (20ms).

Absolute mean audio delay

The absolute audio delay, or mean delay, describes the mean transmission time from mouth to ear, from one side to the other. It influences the interaction in a conversation, and in principle the shorter the mean audio delay, the better the conversation quality. For a comprehensive explanation of the effect that excessive audio delay has from a human perception standpoint, please refer to ITU-T Recommendation G.114.

As explained before, in VoLTE the audio delay strongly depends on the length of the jitter buffer in the user’s device. Early device implementations experienced unacceptably long audio delays, but the first consumer devices now on the market have well-designed clients with more than acceptable delays.

Figure 9 below shows example measurements made in early 2015 in live VoLTE and 3G networks in the U.S.

For voice calls between mobiles and a PSTN line, there are two setups to benchmark, 3G to PSTN and VoLTE to PSTN.

Figure 9: Examples of audio delay in mobile to PSTN calls

There are small differences between the three analyzed operators, but they are not significant from a perceptual point of view. On average the delay is just below 200ms, regardless of whether the call originates in 3G or VoLTE.

The evaluation of mobile to mobile connections can be split into three setups for benchmarking: the common 3G to 3G and VoLTE to VoLTE setups, and also the cross-technology setup 3G to VoLTE.

Figure 10: Examples of audio delay in VoLTE and 3G mobile to mobile calls

As expected, a mobile to mobile connection in circuit-switched 3G with common transcoding shows the highest audio delay. There are two air-links and two compression / de-compression steps applied. The shown average of 300ms is a fair value, but there are many examples of networks with more than 400ms of audio delay, which is barely acceptable. Almost identical audio delay values were measured when calling a VoLTE device from a 3G device.

In both cases there is one operator ‘C’ who applies Transcoding-Free Operation that enables a much shorter transmission time because one coding/decoding step is avoided. This operator ‘C’ is even processing 3G to VoLTE calls without transcoding and keeps the audio wideband capability across the technologies! This will not only result in a short audio delay but also significantly increased audio quality.

When looking at VoLTE to VoLTE, the audio delay is much shorter than in a normal 3G call; however it results in about the same delay as measured with a 3G to 3G call using AMR-WB TrFo.

The given figures are real field examples of well-implemented technology and well-designed devices. In practice, delays may be much higher than these; a delay measurement may help to optimize towards those almost perfect values.

Variable audio delay

The absolute audio delay describes only the delay of an audio stream relative to its origin at the sending side, as a fixed offset. The signal is sent at a point in time t0 and is received at t0 + c, where c is constant. This is not the whole story with regard to audio delay. In a VoLTE environment, parts of the voice (packets) can be received with different latencies, which means that c is no longer constant but varies over time. In practice, there is a given delay and the variation is on top of it.

How can this variable delay happen in a real-time circuit-switched voice transmission? It can happen if a voice stream is transmitted via two different transmission channels A and B, where the transmission time via B is longer than via A. At the far end there is one receiver, and its input can be switched from A to B and vice versa. As a consequence, there is a jump backwards or forwards in the voice stream. Such jumps can happen even in traditional 2G or 3G networks when an inter-cell or inter-technology handover is performed. The voice stream is delivered via the two transmission paths before the handover takes place; however, the synchronization is based on frame borders, not on frame sequence numbers. It may therefore happen that a frame’s content gets repeated or skipped. In those (rare) cases, we can observe variable delay even in circuit-switched mobile channels.

In packet-switched networks, variable delay occurs much more often than in circuit-switched networks. The typical belief is that this is just a consequence of jitter in the packet reception time, which is true in principle. However, none of today’s receivers plays out the received voice frames at the exact moment they arrive. Common practice is the use of a packet buffer, where incoming packets are queued and the decoder takes one after another at defined time steps. What is observed as variable delay is only the uncompensated part of the packet delay jitter and its effect on the voice stream; this is different from the packet jitter itself.

[Figure 11 diagram: incoming packets -> Jitter Buffer -> Decoder]

Figure 11: Basic principle of a jitter buffer to compensate packet delay jitter

In case of very long packet delay or packet loss, it may happen that the buffer runs empty (buffer underrun). In this case, the decoder has no more information to decode. Clever strategies are used to solve this issue. Intelligent decoders recognize an upcoming underrun and try to extend speech pauses to empty the buffer more slowly, in the hope that new packets are received in the meantime. If there are no speech pauses, the speech information itself gets stretched, either by time-warping or by creating additional voice frames, similar to the replacement strategies in voice decoders. All these smoothing strategies try to gain time by bridging long delays of the next expected packet. However, these strategies can bridge only a limited time when the buffer runs empty. The worst case is that the smoothing strategies reach their own limits and voice is interrupted due to excessive buffer underrun. The decoder simply has nothing to work with. It is logical that all these issues are less common with the use of a larger jitter buffer. Conversely, these issues will be more common if the jitter buffer is short. There is a direct relation to the overall audio delay as discussed in the previous section. A long jitter buffer can avoid warping the voice signal, but it pays for this with a higher audio delay in general.

However, each strategy for stretching the voice stream or even pausing accumulates more and more delay in the signal. Once packets are received again and the buffer gets filled, decoders try to get rid of the accumulated delay again. Usually, speech pauses get shortened for that.

The effects on the voice are very different. Extending or shortening speech pauses is almost imperceptible, since it does not change the speech signal itself. It is different if the voice signal itself gets stretched or compressed: this affects the talk spurts, which become longer or shorter. A pure extension or compression by re-sampling also changes the spectral distribution, since the voice pitch gets higher or lower and may sound unnatural. Therefore, time-warping with pitch preservation is used. This way a moderate stretching or compression of the voice becomes much more acceptable.

In consequence, packet delay jitter is largely compensated; the uncompensated remainder can, to some extent, be masked by smart voice processing. Uncompensated packet delay jitter affects the temporal structure of the voice stream and can be measured physically, but its effect on speech quality depends on the processing strategy used in the decoder and can only be evaluated with a speech quality algorithm.
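Why plain re-sampling changes the pitch can be shown with a pure tone. The sketch below is an illustration only, not the time-warping used in any real decoder: it stretches a 200Hz sine by 10% using linear interpolation and estimates the resulting frequency from zero crossings. The sample rate, tone frequency, stretch factor and function names are all assumptions chosen for the example.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed narrowband sampling rate


def sine(freq_hz, duration_s):
    """Generate a sine tone as a list of samples."""
    n = int(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def stretch_by_resampling(samples, factor):
    """Stretch a signal in time by linear-interpolation re-sampling.

    The output is played back at the unchanged sample rate, so every
    waveform period becomes `factor` times longer.
    """
    out_len = int(len(samples) * factor)
    out = []
    for k in range(out_len):
        pos = k / factor
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append((1 - frac) * samples[i] + frac * nxt)
    return out


def estimate_freq(samples):
    """Rough frequency estimate from the number of positive-going zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * SAMPLE_RATE / len(samples)


tone = sine(200, 1.0)
stretched = stretch_by_resampling(tone, 1.1)
original_freq = estimate_freq(tone)
stretched_freq = estimate_freq(stretched)
print(f"original ~{original_freq:.1f} Hz, stretched ~{stretched_freq:.1f} Hz")
```

The stretched tone comes out lower in frequency by roughly the stretch factor, which is the pitch change described above. Pitch-preserving time-warping avoids this by adding or removing whole pitch periods instead of rescaling the waveform; that technique is not implemented here.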

Measuring and evaluation of variable audio delay

The following measurements illustrate the occurrence of variable audio delay. Variable audio delay can only usefully be measured if a voice stream is transferred. To emulate talk spurts as in a real conversation, human speech consisting of words forming sentences must be used. Only then do the proportion and occurrence of speech pauses and active talk spurts correspond to those of a human conversation.

Variable audio delay can be physically measured by sub-dividing the original signal into short segments, locating each segment in the received signal and measuring its delay. The delay can be given as an absolute value for each segment or, more illustratively, relative to the mean of the evaluated talk spurt or group of sentences.

For those measurements an intermediate result structure of ITU-T P.863 ‘POLQA’ can be used. POLQA aligns each speech segment of the received signal to a corresponding segment in the reference. The positions of the segments in the signals can be used as a measure for variable delay. If all segments have the same offset in their positions, the delay is constant for all segments. If there is no constant offset, there are variations in the delay. Figure 12 shows an extreme example of variable delay. The audio signal on the top line is the reference audio clip. On the lower line, the received audio clip, there are extended pauses and at the end a skipped part of speech. The red audio from the reference clip is the part that was skipped. The white areas in the received clip show the parts where pauses have been extended.
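The segment-wise measurement described above can be sketched with a plain cross-correlation search. This is a simplified illustration and not the time alignment of POLQA: random noise stands in for speech, the received signal is constructed with a known delay jump of one 20ms frame, and the segment length, search range and function name `segment_delays` are hypothetical choices.

```python
import random

SAMPLE_RATE = 8000   # Hz, assumed
SEGMENT = 200        # samples per analysis segment (25ms), assumed


def segment_delays(reference, received, max_lag):
    """Return the best-matching lag (in samples) for each reference segment.

    For every segment, all lags from 0 to max_lag are tried and the lag with
    the highest correlation between segment and received window is kept.
    """
    delays = []
    for start in range(0, len(reference) - SEGMENT + 1, SEGMENT):
        seg = reference[start:start + SEGMENT]
        best_lag, best_score = 0, float("-inf")
        for lag in range(max_lag + 1):
            window = received[start + lag:start + lag + SEGMENT]
            if len(window) < SEGMENT:
                break  # ran past the end of the received signal
            score = sum(a * b for a, b in zip(seg, window))
            if score > best_score:
                best_lag, best_score = lag, score
        delays.append(best_lag)
    return delays


# Synthetic test signal: noise as a stand-in for speech (an assumption).
rng = random.Random(7)
reference = [rng.gauss(0, 1) for _ in range(1600)]

# Received signal: 20ms initial delay, then an extra 20ms gap after 100ms.
initial_delay = 160  # samples = 20ms
extra_gap = 160      # samples = 20ms, one voice frame
received = ([0.0] * initial_delay + reference[:800]
            + [0.0] * extra_gap + reference[800:])

delays_ms = [lag * 1000 / SAMPLE_RATE
             for lag in segment_delays(reference, received, max_lag=400)]
delay_range_ms = max(delays_ms) - min(delays_ms)
print(delays_ms, "range:", delay_range_ms, "ms")
```

The per-segment delays are absolute values; subtracting their mean gives the relative representation mentioned above, and the difference between maximum and minimum gives the range of delay variation within the analyzed stretch of signal.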

Figure 12: Example of an aligned pair of reference and degraded signal

This example is directly taken from the time-alignment procedure of POLQA. In consequence, POLQA can deliver information about the occurrence and the amount of variable delay too.

The following measurements in real VoLTE and 3G calls show the range of delay variations in 6s talk spurts, that is, the difference between the minimum and the maximum of the measured short-term delays.

The following detailed pictures illustrate the relative delay during a transmitted speech sample in a VoLTE channel (Figure 13). It can be seen that the delay increases (stepwise) during the first sentence and gets adjusted at the end of the speech pause again.

Figure 13: Example of variable delay in a VoLTE live test sample

In CS 3G voice channels delay changes are very rare, but they can happen in conjunction with inter-cell handovers. The example in Figure 14 shows a change in delay of 20ms, which is one AMR frame.

Figure 14: Example of variable delay in a 3G live test sample

However, most delay changes in circuit-switched networks happen during inter-technology handovers, and especially when a call changes from transcoding-free AMR-WB to regular AMR.

Usually, such a handover causes an interruption in the voice stream. If this interruption does not occur, the handover may lead to a perceptible ‘glitch’, since the coded and differential voice information of the two subsequently received channels does not match. Thus, in this case a listener does not perceive the delay variation itself but rather the desynchronized decoder or just a gap.

The following picture shows the occurrence of delay changes for the two sets, VoLTE and 3G. The data is the same as used for the analysis in section ‘Audio quality in VoLTE’ obtained in summer 2014.

Figure 15: Occurrence of delay changes for VoLTE and 3G live network samples

In 3G, almost all samples (94%) show no delay changes; the threshold of 20ms corresponds to one AMR frame, and only 6% of the measurements are above that value. For VoLTE the picture is different: about 70% of the samples show delay changes of >20ms.

But does this variable delay matter from the user perspective in terms of audio quality? This analysis cannot be done one to one, since the test samples are also affected by other distortions in the live network. Especially for 3G, delay changes are usually combined with handovers, potential gaps and short muted speech parts, which also have an obvious effect on MOS values.

However, an indication of speech quality degradation can be derived if the MOS scores are analysed separately for different amounts of variable delay. The diagram in Figure 16 shows the P.863 averages for samples with a dedicated range of delay changes vs. time warping.
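Such a grouped analysis can be sketched as follows. This is only an illustration of the binning idea: the bin boundaries are hypothetical, the function names are invented for this example, and the sample values are placeholders, not measurement results from this study.

```python
from collections import defaultdict
from statistics import mean

BIN_EDGES_MS = [20, 40, 80]  # hypothetical bin boundaries for the delay-change range


def bin_label(delay_change_ms):
    """Map a delay-change range in ms to a human-readable bin label."""
    lower = 0
    for edge in BIN_EDGES_MS:
        if delay_change_ms <= edge:
            return f"{lower}-{edge} ms"
        lower = edge
    return f">{BIN_EDGES_MS[-1]} ms"


def mos_by_delay_change(samples):
    """Average the MOS scores of all samples falling into the same bin.

    `samples` is a list of (delay_change_ms, mos) pairs, one per test sample.
    """
    groups = defaultdict(list)
    for delay_change_ms, mos in samples:
        groups[bin_label(delay_change_ms)].append(mos)
    return {label: mean(values) for label, values in groups.items()}


# Placeholder values for illustration only; NOT results from the field tests.
samples = [(0, 4.0), (10, 4.2), (30, 4.0), (35, 3.8), (100, 3.5)]
averages = mos_by_delay_change(samples)
for label, avg in averages.items():
    print(f"{label:>9}: mean MOS {avg:.2f}")
```

With real data, each pair would come from one test sample: the measured range of delay variation and the P.863 score of the same sample.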

Figure 16: P.863 MOS-LQO statistics in relation to delay changes in the test sample

For VoLTE with variable delay the situation is quite stable. Even if a variable delay can be measured physically in the signal, it hardly decreases the speech quality. This underlines that jitter compensation strategies, such as focusing on speech pauses and moderate stretching with pitch preservation, can maintain good audio quality.

4 Conclusion

Voice over LTE is becoming today’s reality in more and more live networks. Within this transition period, careful benchmarking of the VoLTE services themselves as well as of legacy technologies is extremely important to guarantee a quality standard that is comparable to well-known 2G and 3G services. For quite a while both technologies will be used in parallel. It is important not only to benchmark the technologies against each other but also to benchmark cross-technology connections. This is the daily experience for the majority of VoLTE users’ calls, and in the near future VoLTE to CS interconnections will make up the majority of mobile calls in general.

This transition period may last for the next few years and requires close supervision by benchmarking and optimization teams. Measurements on the application layer using POLQA help to guarantee the best voice call experience. As can be seen in the data presented above, optimizing certain elements such as jitter buffers can help improve conversation quality and, when done correctly, can do so without a severe compromise in audio quality. This report gives first example values for voice call performance and highlights the trade-off between speech quality and audio delay. The results indicate that VoLTE may provide the best voice call performance overall, appearing to be better than today’s 3G networks.

However, the integration of VoLTE functionality into handsets plays an important role in the end user voice experience, much more than in the past. Benchmarking systems used to test VoLTE networks must use commercially available devices designed for use on the network they are testing. It is also critical for the measurement system to emulate real world voice call use cases and to offer the full complement of RF (Layer 1, 2, and 3), SIP, IMS, RTP and all other TCP/IP measurement capabilities, in order to really understand what is behind good or bad network performance. This will ensure that network operators have a true picture of the user’s satisfaction when using VoLTE.

The results collected and analyzed in this study already show impressive and rapid improvements made within the months since early commercial launch. The data collection was made in a period where one operator had already changed to transcoding-free AMR-WB processing across technologies, while others have still to take this step. This advantage can be recognized on the application layer by measuring high quality in wide audio bandwidth in combination with a short transmission delay. Both are valid indicators that this technology is enabled and will live up to expectations.

Nevertheless, the road ahead to a 100% IP-based communications world will be full of pitfalls. The changeover from 2G/3G systems to all-IP LTE is happening now, and with it there will be more re-farming of spectrum, a continued explosion of deployed LTE bands, carrier aggregation, WiFi offloading, etc. As network operators strive to deliver all the promises of high-speed data networks, it will be critical for them to keep their eye on voice performance. Nothing could be more damaging than customer churn created by a poor VoLTE experience from a service that consumes only 12kbps of data bandwidth.