White Paper – Performance-Assured Ethernet
May 2012 | Rev 1.0
Performance-Assured Ethernet
Prepared by Stan Hubbard - Senior Analyst, Heavy Reading on behalf of Accedian Networks and Cyan Inc.
Executive Summary
Applications ranging from cloud connectivity to wireless backhaul increasingly rely on high-performance Ethernet services. Service provider operations and enterprise IT managers charged with managing the performance of services need accurate performance monitoring instrumentation to measure the network using standard service operations, administration and maintenance (OAM) measurements for one-way delay (latency), one-way delay variation (jitter) and frame loss. Accurate instrumentation and performance management systems that collect, summarize and present network performance data in the form of actionable business intelligence are becoming required tools for high-performance services.
This white paper explores critical service and operational requirements from leading Ethernet service providers, discusses standards-based approaches for measuring and reporting Ethernet service performance in a manner that provides a common language for service providers and end users globally, and examines key considerations for rolling up this performance assurance data in a manner that provides actionable business intelligence for service providers and end users. The paper also explores some of the broader planning, management and end-to-end interoperability considerations for delivering performance-assured Ethernet (PAE) services in the reality of today's modular, multi-vendor networks.
This white paper complements a recent webinar on the same topic hosted by Light Reading and Heavy Reading. An archive of the webinar can be found at: http://www.lightreading.com/webinar.asp?webinar_id=29978.
CE 2.0 Requires Performance Assurance
Faced with intense competition and operating cost pressures, Ethernet service providers worldwide have been looking for service management tools that will help them deploy services faster and more efficiently, provide greater visibility into service and network performance on an end-to-end basis, quickly resolve service problems, and improve the overall customer experience. They have encouraged and supported the development of OAM technology designed to address management concerns throughout the service lifecycle.
Performance assurance and other aspects of service management are only going to become more important as operators transition from Metro Ethernet Forum-defined carrier Ethernet (CE) 1.0 networks and services toward CE 2.0 networks and services. Whereas CE 1.0 services generally have involved limited management on a single provider network, higher-performance CE 2.0 services are characterized by extensive, advanced manageability and multiple classes of service over potentially multiple interconnected provider networks.
The shift to CE 2.0 also means an increased emphasis on service-level agreements (SLAs), including the potential for more demanding SLA-related requests from retail and wholesale customers. Anecdotal information suggests SLAs are emerging as a pain point for many Ethernet service providers, because some of their customers are asking them to commit to certain performance levels before a service is deployed and tested. This means operators will not only be looking for robust management of active services, but will also welcome tools that enable them to model and predict service performance at the front end of the service lifecycle.
Performance assurance is important for a number of high-growth Ethernet services and Ethernet-based applications, including mobile backhaul, Ethernet business services, cloud connectivity, high-frequency trading, video delivery and tele-protection for utilities. Let's look briefly at two of the most important applications below: mobile traffic backhaul and high-performance Ethernet business services.
Mobile Backhaul Performance Assurance
Mobile service providers are rushing to expand their backhaul networks, and carrier Ethernet has emerged as the technology of choice to affordably handle the explosion in their mobile data traffic. In many cases, wireless providers are outsourcing backhaul to incumbent telecom operators or alternative access vendors. Outsourcing requires mobile providers to successfully monitor the services that they are buying to ensure the highest quality of experience for mobile users.
Increasingly, RFPs and master service agreements of mobile providers are requiring not only instrumentation to monitor network performance, but also Web-based portals for active and accurate reporting of data in near-real time, as well as on a monthly basis. Beyond reporting, operators are also looking for performance assurance solutions that can help them minimize latency that could negatively impact LTE services.
Ethernet Business Service Performance Assurance
While Ethernet business services remain the fastest-growing data services opportunity in the wireline market, providers of retail services face intensifying competition due to a growing number of players with overlapping service coverage. Many service providers have embraced PAE solutions as a way to differentiate their service portfolios, accelerate service velocity and improve the customer experience, especially with premium, high-touch services.
Heavy Reading expects that interest in PAE will only grow over time as retail providers look to upgrade their portfolios in line with the shift to CE 2.0 and evolve their networks to support on-demand cloud services. Applications that were once running locally in enterprise networks are now moving to the cloud, so low latency and reliable throughput will be essential. Businesses will require methods for monitoring the performance of their network services as those services become critical to running business operations seamlessly.
Service Management Is Business-Critical
Feedback from senior service provider members of the global Ethernet Executive Council indicates that performance and fault management are becoming increasingly important for a growing number of companies that offer Ethernet business services and wholesale services sold to retail service providers or mobile operators. More than 80 percent of Council members who participated in Heavy Reading's 4Q11 state-of-the-industry survey said that they believe the ability to provide end-to-end SLA performance guarantees is an important, very important, or critical differentiator in today's market.
Reliance Globalcom's VP of Network Architecture summed up why operators value robust service management when he stated the following at Ethernet Expo Americas in late 2011:
Service management is business-critical. Our guys have got to be able to assure the service. They've got to have visibility. They've got to be able to answer customer questions… which is why we were a very early adopter of 802.1ag and Y.1731.
Imagine an Ethernet network where you couldn't tell what was talking to what. You couldn't tell what your SLA was… How do you tell whether there was an issue and what the issue was? Customers are, of course, the first to call you and say there is an issue. But unless we have the ability to look at all the infrastructure, look at all of the customer sites, look at what is happening to the traffic at these sites, we have a tough time correlating what is happening in the network to what the customer is experiencing. What you cannot measure, you cannot manage. What you cannot see, you cannot show. [With OAM] we now find that we can resolve issues in a timely manner.
Performance-Assured Ethernet's Key Elements
Colt, Reliance Globalcom and other leading service providers have welcomed the emergence of PAE solutions that utilize the latest OAM technologies, integrate well into a multi-vendor environment, and help speed delivery, increase service value and enhance customer experience, while keeping capital and operational costs under control.
Available PAE solutions now go well beyond basic reporting tools. Embodying the concept of CE 2.0, today's PAE solutions offer advanced capabilities that take carrier Ethernet to the next level by facilitating deterministic performance with absolute quality of service that can be matched to the requirements of particular applications. PAE offers greater levels of performance than CE 1.0, even with increased levels of service density.
The principal characteristics of PAE include the ability to:
1. Plan and predict service performance.
2. Verify connectivity and the performance of a circuit at the time that it is activated.
3. Validate end-to-end SLA compliance throughout the life of the circuit.
4. Engineer the multi-layer network and proactively manage faults to deliver consistent performance.
5. Visualize and report on service and application performance on an ongoing basis.
Plan & Predict Performance
Network planning sets the stage for all subsequent phases of performance objective fulfillment. When planning the physical, optical and Ethernet layers of the network, the performance objectives of a potentially wide range of applications must be considered. Performance constraints will play a part in network design choices, such as the length of fiber runs, the speed of links, the routing of primary and protected paths, and the number of Ethernet switching hops (optical-electrical-optical) versus optical switching points (optical-optical-optical).
There are now tools on the market that facilitate multi-layer network planning and design and also enable operators to predict the network's performance potential. For example, latency can be modeled between any entry and exit point on the network by looking at factors ranging from the fiber routes to the latency characteristics of individual nodes.
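The latency modeling described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the figure of about 5 microseconds per kilometer of fiber is a common rule of thumb, and the `Hop` type, the per-node latencies, and the span lengths are hypothetical values chosen for the example, not data from any planning tool.

```python
from dataclasses import dataclass

# Rule-of-thumb propagation delay in optical fiber (assumption: ~5 us per km).
FIBER_DELAY_US_PER_KM = 5.0


@dataclass
class Hop:
    """One fiber span plus the node at its far end (hypothetical model)."""
    fiber_km: float
    node_latency_us: float


def predict_one_way_latency_us(path):
    """Sum fiber propagation delay and node latency over every hop."""
    return sum(h.fiber_km * FIBER_DELAY_US_PER_KM + h.node_latency_us
               for h in path)


# Illustrative path: two Ethernet switching hops and one optical switching point.
path = [
    Hop(fiber_km=40.0, node_latency_us=10.0),
    Hop(fiber_km=25.0, node_latency_us=1.0),
    Hop(fiber_km=60.0, node_latency_us=10.0),
]
predicted_us = predict_one_way_latency_us(path)
print(f"Predicted one-way latency: {predicted_us / 1000:.3f} ms")
```

A real planning tool would also account for queuing under load and protection-path routing, which this sketch leaves out.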
With newly available PAE tools, service providers are able to predict how well services will perform not only on an individual basis, but also on an aggregated basis. This gives them deeper insight into the network's capabilities before an SLA commitment is ever made to a customer.
“Customers want to buy a solution or a service that encompasses an end-to-end SLA, and not 10 different single SLAs that use 10 different reporting portals that all look different. What our customers expect going forward is a single end-to-end SLA for their solution as well as comprehensive reporting data that cover both the network side as well as the application. Performance-assured Ethernet, to us, is one of those building blocks that will enable us to deliver such an end-to-end SLA. As well, it is going to help us to bring much more network intelligence into the applications.”
– Director of Network Strategy & Architecture, Colt
Figure 1: Modeling & Predicting Latency
Source: Cyan
Verify Performance at Service Activation
The next step in PAE is to verify that an Ethernet circuit is properly configured and that all key performance indicators (KPIs) or SLA parameters such as throughput, frame loss, latency and jitter are met when the circuit is activated. After a short configuration test, an initial service performance test validates the quality of the service and measures the SLA parameters. Remote diagnostic tools help operators work on a circuit, if needed, until it meets the required performance. A "service birth certificate" is typically generated at this stage and can be referenced at a later date to examine how service performance might have changed over time.
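As a rough sketch of the activation check described above, the following compares measured KPIs against SLA limits and stores the result as a simple "birth certificate" record. The `Kpis` type, the field names, and all numeric limits are hypothetical. This does not implement RFC 2544 or Y.1564 test procedures; it only shows the final comparison step.

```python
from dataclasses import dataclass


@dataclass
class Kpis:
    """KPI values for one circuit (hypothetical field names and units)."""
    throughput_mbps: float
    frame_loss_pct: float
    latency_ms: float
    jitter_ms: float


def activation_report(measured, sla):
    """Return a per-KPI pass/fail map for a newly activated circuit."""
    return {
        "throughput": measured.throughput_mbps >= sla.throughput_mbps,
        "frame_loss": measured.frame_loss_pct <= sla.frame_loss_pct,
        "latency": measured.latency_ms <= sla.latency_ms,
        "jitter": measured.jitter_ms <= sla.jitter_ms,
    }


# Illustrative SLA limits and turn-up measurements.
sla = Kpis(throughput_mbps=100.0, frame_loss_pct=0.01, latency_ms=10.0, jitter_ms=2.0)
measured = Kpis(throughput_mbps=100.0, frame_loss_pct=0.0, latency_ms=4.2, jitter_ms=0.3)

detail = activation_report(measured, sla)
# Keep the baseline so later measurements can be compared against it.
birth_certificate = {"measured": measured, "detail": detail,
                     "passed": all(detail.values())}
print(birth_certificate["passed"])
```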
Many Ethernet service providers have traditionally used RFC 2544 to conduct turn-up tests, but the new ITU-T Y.1564 service activation test methodology that emerged in 2011 is more favorably aligned with the shift to CE 2.0 services. Y.1564's next-generation test capabilities promise to provide more accurate validation and enable faster deployment and troubleshooting compared to RFC 2544 testing.
Validate End-to-End SLA Compliance Over Time
One of the most important technology-related developments in the Ethernet market in recent years has been the adoption of ITU-T Y.1731 performance monitoring that allows operators to continuously measure frame loss, latency and jitter on an end-to-end basis throughout the life of a circuit. Performance statistics are rolled up into online service portals, the most advanced of which provide visibility on essentially a real-time basis.
With PAE, Y.1731 OAM-enabled network devices or dedicated probes are used to inject monitoring frames in-band with Ethernet services, and traffic statistics are gathered using one-way metrics codified in the MEF 10.2 technical specification that covers Ethernet service attributes. The ability to gather one-way metrics is necessary in order to obtain an accurate picture of service performance in today's networks, which are typically characterized by asymmetric traffic flows in which as much as 80 percent of traffic comes from the core (due to popular applications such as over-the-top video and live video streaming).
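To make the one-way metrics concrete, here is a simplified sketch that derives delay, delay variation and frame loss from timestamped monitoring frames. It is not the Y.1731 or MEF 10.2 procedure, whose formal definitions are more detailed. The function name, the dictionary layout, and the sample timestamps are assumptions for illustration, and the sender and receiver clocks are assumed to be synchronized.

```python
def one_way_metrics(sent_us, received_us):
    """Compute simplified one-way metrics from probe-frame timestamps.

    sent_us maps sequence number -> transmit time (microseconds, sender clock).
    received_us maps sequence number -> receive time (microseconds, receiver clock).
    Any offset between the two clocks shifts every delay by the same amount.
    """
    if not sent_us:
        raise ValueError("at least one probe frame must have been sent")
    delays = [received_us[seq] - sent_us[seq]
              for seq in sorted(sent_us) if seq in received_us]
    lost = len(sent_us) - len(delays)
    # Delay variation here is the difference between consecutive frame delays.
    variations = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return {
        "mean_delay_us": sum(delays) / len(delays) if delays else None,
        "max_delay_variation_us": max(variations) if variations else 0,
        "frame_loss_pct": 100.0 * lost / len(sent_us),
    }


# Four probes sent 1 ms apart; probe 3 never arrives.
sent = {1: 0, 2: 1000, 3: 2000, 4: 3000}
received = {1: 4200, 2: 5300, 4: 7100}
metrics = one_way_metrics(sent, received)
print(metrics)
```

Because each delay is a difference between two different clocks, an unsynchronized receiver would bias every delay value by its clock offset, while the delay variation figures would be unaffected.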
To make accurate one-way measurements, the test points have to be synchronized. Network timing clocks need to be aligned to the exact time of day to within a tolerance of microseconds in order to measure one-way delays that are typically less than 10 milliseconds. Many operators have deployed PAE network interface devices (NIDs) built with hardware-based OAM that is specifically designed to handle the synchronization requirements associated with one-way measurements. These operators have turned to NIDs because much of the equipment currently deployed in their networks had OAM features added after the equipment originally shipped. Much of the installed equipment thus lacks the fundamental hardware capabilities required for accurate
Figure 3: Quality of Service Enforcement
Source: Cyan
Proactively Manage Faults & Deliver Consistent Performance
Proper execution of fault management and traffic management is as critical to achieving performance goals as planning, provisioning and performance measurement.
PAE solutions use IEEE 802.1ag OAM to perform end-to-end connectivity fault management across one or more carrier networks. 802.1ag functions on a per-service VLAN or per-Ethernet Virtual Connection (EVC) basis. It helps operators detect when a customer service is down using proactive continuity check messages, verify the loss of service connectivity using loopback "ping" messages, isolate the service connection failure using link trace messages, and report end-to-end connectivity faults.
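The continuity check step can be sketched as a simple timeout test. This assumes the commonly cited 802.1ag behavior of declaring loss of continuity when no continuity check message (CCM) arrives within 3.5 CCM intervals. The function and variable names are hypothetical, and real implementations run this per maintenance endpoint in hardware or firmware.

```python
# Assumption: continuity is considered lost after 3.5 CCM intervals with no CCM.
CCM_LOSS_MULTIPLIER = 3.5


def continuity_lost(last_ccm_rx_s, now_s, ccm_interval_s):
    """Return True if the remote endpoint has been silent for too long."""
    return (now_s - last_ccm_rx_s) > CCM_LOSS_MULTIPLIER * ccm_interval_s


# With a 1-second CCM interval, 3.0 s of silence is tolerated, 3.6 s is not.
print(continuity_lost(last_ccm_rx_s=0.0, now_s=3.0, ccm_interval_s=1.0))
print(continuity_lost(last_ccm_rx_s=0.0, now_s=3.6, ccm_interval_s=1.0))
```

Once loss is detected this way, loopback and link trace would be the follow-up steps for verifying and isolating the fault.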
As a complement to 802.1ag, the MEF has developed service OAM specifications aligned with the shift to CE 2.0 services. MEF 30 defines a framework for service OAM that provides mechanisms to detect, verify, isolate and report end-to-end connectivity faults, while MEF 31 defines a management information base for multivendor fault detection and troubleshooting.
On the traffic management front, the network must do its part to control latency and jitter and to maximize throughput in order to meet stringent performance goals. This can be achieved by implementing connection-oriented Ethernet (COE) with deterministic levels of performance in transport and switching equipment. COE includes quality of service (QoS) mechanisms that are service-aware and have low-latency queuing and scheduling capabilities. Meanwhile, connection admission control is needed to prevent over-provisioning of guaranteed services, including protection paths.
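A minimal sketch of the connection admission control idea follows, assuming a single link and committed rates only. The function name and the bandwidth figures are hypothetical. The key point is that bandwidth reserved for protection paths counts against the link in the same way as working paths.

```python
def admit(link_capacity_mbps, committed_mbps, new_cir_mbps):
    """Admit a new guaranteed service only if total commitments fit the link.

    committed_mbps lists the committed rates already reserved on the link,
    including reservations made for protection paths.
    """
    return sum(committed_mbps) + new_cir_mbps <= link_capacity_mbps


# 1 Gbit/s link with 900 Mbit/s already committed (working plus protection).
existing = [400.0, 300.0, 200.0]
print(admit(1000.0, existing, 100.0))  # fits exactly within capacity
print(admit(1000.0, existing, 150.0))  # would over-provision the link
```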
Figure 2: One-Way Metrics Require Synchronized Test Points
Source: Accedian Networks
Figure 4: Cloud-Based Web Portal Tool Examples – Accedian Networks’ VisionMETRIX & Cyan’s CyPortal
Sources: Accedian Networks and Cyan
The implementation of multi-layer fault management and ultra-fast protection switching also contributes to the achievement of performance goals. Ethernet services today are delivered over a multi-layered transport network that may include not only Ethernet switching, but also potentially OTN and WDM optical layers. All these transport layers play an important role in meeting SLA goals for the Ethernet services that run over them. For example, multi-layer management techniques can identify issues such as optical impairments and allow the service provider to take corrective action before Ethernet service performance is negatively affected.
Visualize Service & Application Performance
The proper reporting of performance data is essential to leveraging it for business value. The quality of reporting is influenced by its content, as well as where and how it is reported.
SLA compliance reports often include both real-time and historical components. Real-time reporting can give service providers or customers a compliance snapshot that may be useful if the underlying application is experiencing problems. Historical reports can include rolling SLA compliance information in order to identify trends.
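One simple way to produce the rolling compliance figure mentioned above is to track the share of recent measurement intervals that met the SLA. The sketch below assumes each interval has already been judged compliant or not. The function name, window size, and sample data are hypothetical.

```python
from collections import deque


def rolling_compliance(interval_ok, window):
    """Return the compliance percentage over the last `window` intervals.

    interval_ok is a sequence of booleans, one per measurement interval,
    True when the interval met the SLA.
    """
    recent = deque(maxlen=window)  # oldest interval drops out automatically
    percentages = []
    for ok in interval_ok:
        recent.append(ok)
        percentages.append(100.0 * sum(recent) / len(recent))
    return percentages


# Six intervals, one of which missed the SLA; window of four intervals.
history = [True, True, False, True, True, True]
trend = rolling_compliance(history, window=4)
print(trend)
```

A steadily falling value in such a series would be the kind of trend a historical report is meant to surface.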
Performance management tools typically include the ability to configure threshold crossing alerts (TCAs) for performance parameters. Real-time TCA notification is important to enable fast corrective action, and reporting TCA events over a time period can also be very useful for macro-level remediation of systemic problems.
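A threshold crossing alert can be sketched as edge detection on a stream of samples: the alert fires when a metric moves from at or below the threshold to above it, not on every sample that stays above. The function name, the 10 ms threshold, and the latency samples are assumptions for illustration.

```python
def threshold_crossings(samples, threshold):
    """Return the indexes where a metric rises above the threshold.

    Only the transition is reported, so a sustained violation produces a
    single alert rather than one per sample.
    """
    crossings = []
    was_above = False
    for index, value in enumerate(samples):
        is_above = value > threshold
        if is_above and not was_above:
            crossings.append(index)
        was_above = is_above
    return crossings


# Latency samples in milliseconds against a hypothetical 10 ms threshold.
latency_ms = [4.0, 4.5, 11.0, 12.0, 5.0, 10.5]
alerts = threshold_crossings(latency_ms, threshold=10.0)
print(alerts)
```

Counting these events per day or per circuit gives the longer-term view used for systemic remediation.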
A growing number of service providers, including tw telecom, Colt, Level 3, Reliance Globalcom and KPN International, have rolled out Web service portals to provide end-to-end performance visibility for their customers. PAE solutions providers such as Accedian Networks and Cyan now offer cloud-based performance monitoring and reporting tools that support multi-vendor environments.
The ability to offer end-to-end performance visibility via a portal that leverages the statistics provided by PAE devices has been described by a senior product management expert at Level 3 as a "very, very large step change for the industry." This development has been welcomed by customers and service providers alike.
For end customers, online statistics can instill confidence that they are getting the level of service performance they expect and that their service provider has the necessary insight to quickly resolve an issue, should one emerge. Real-time monitoring also enables customers to rapidly determine if the network is the source of any application problems they are experiencing. This can help reduce finger-pointing and speed problem resolution.
“It’s good to give the customer a touch point… The customer has a dashboard and you have your proof point. The portal is important if only to prove what you are actually delivering…All of the sudden the customer starts seeing the value and starts understanding what the network is bringing to him… and, yes, the customers pay extra for that. They’re actually paying for the professional service on top.”
– Product Manager, NGN Services, KPN International
End customers can also use the performance data for business decision-making that is likely to primarily revolve around how to best utilize network services for their applications. They might decide to upgrade to higher CIR or more premium low-latency offerings. They also might use the information to gain confidence with a service in order to migrate additional applications to a service provider instead of using a private network.
For service providers, online data can help them stay ahead of the curve in addressing performance challenges, enhance their service wrap and develop deeper relationships with customers based on a greater understanding of their traffic flows and application-specific requirements. Performance data can also be used internally for better capacity planning or performance optimization.
Many operators report that customers want to have access to performance data and increasingly expect this from service providers. Rather than charging separately for portals, most operators appear to be bundling portals into their premium service offerings as a way to enhance the customer experience. As illustrated by the KPN International quote above, operators may also find opportunities to generate additional revenue by leveraging portal information to engage in a consultative relationship with customers.
Next Steps for Performance-Assured Ethernet
Many carrier experts have told Heavy Reading that they are pleased with innovations in Ethernet service OAM and PAE that have begun to make a positive impact in controlling operating costs and improving customer satisfaction, but they still believe that more needs to be done to make sure that OAM works well across the portfolio of a single vendor, across equipment from multiple vendors, and across networks of multiple operators. For example, the CTO of XO noted in late 2011 that implementations of a variety of OAM protocols are improving and interworking, but the end-to-end service management "still has a ways to go," including with various network elements from a single supplier.
This sentiment was echoed by the Head of Transport Solutions at Deutsche Telekom ICSS, who stated, "Deployment of equipment from different vendors creates a lot of work in terms of management. There is really no smooth and harmonized seamless OAM available among these vendors. This is something which is of high interest to all of the carriers."
While the MEF is striving to improve inter-carrier cooperation on service management, the reality in the near term is that Ethernet service providers and mobile service providers will often deploy their own PAE platforms at customer sites to gain end-to-end performance visibility rather than trying to rely on OAM harmonization with wholesale partners.
Going forward, PAE instrumentation will evolve beyond NIDs into other network elements to meet the OAM gap existing in today's networks. Cyan has integrated OAM functionality into the silicon of its packet-optical transport platforms, which support the full suite of MEF services.
Accedian Networks is integrating PAE functionality with support for G.8032v2 protection and EVC add/drop capability to create a simpler alternative to using carrier Ethernet switch/routers for delivering resilient Ethernet services over optical ring network topologies. Accedian has also developed what it calls "virtual NID" technology that can be embedded in third-party network elements to obtain accurate one-way measurements of KPIs that are seamlessly reported as if an actual hardware NID were deployed in the network.
Performance-assured Ethernet is an important growth engine for future services. As service providers embrace PAE solutions, competitive forces will cause adoption to accelerate. Higher adoption rates combined with the transparency that PAE provides will drive improvement in the overall quality of Ethernet services. This in turn will result in better performing applications and increasing utilization of the network to support even more services.
© 2012 Accedian Networks Inc. All rights reserved. Accedian Networks, the Accedian Networks logo, High Performance Service Assurance, Performance Assurance Agent, MetroNID, EtherNID, MetroNODE 10GE, Fast-PAAs, PAA, SLA-Meter, Plug & Go, Multi-SLA, Traffic-Meter, Vision EMS, VisionMETRIX, V-NID are trademarks or registered trademarks of Accedian Networks Inc. All other company and product names may be trademarks of the respective companies. Accedian Networks may, from time to time, make changes to the products or specifications contained herein without notice. Some certifications may be pending final approval; please contact Accedian Networks for current certifications.
Accedian Networks Inc.
2351 Alfred-Nobel, Suite N-410
St-Laurent (Montreal), Quebec, Canada, H4S 2A9
Toll free: 1-866-685-8181