wms product engineering guide

UMT/OAM/APP/024291 Alcatel-Lucent Wireless Management System WMS Product Engineering Guide LR14.xW 05.01 / EN Preliminary June 2014

Upload: hung6715

Post on 01-Oct-2015

  • Alcatel-Lucent Wireless Management System

    WMS PRODUCT ENGINEERING GUIDE

    Document number: UMT/OAM/APP/024291
    Document issue: 05.01 / EN
    Document status: Preliminary
    Product Release: LR14.xW
    Date: June 2014

    2009-2013 Alcatel-Lucent

    All rights reserved.

    UNCONTROLLED COPY: The master of this document is stored on an electronic database and is write protected; it may be altered only by authorized persons. While copies may be printed, it is not recommended. Viewing of the master electronically ensures access to the current issue. Any hardcopies taken must be regarded as uncontrolled copies.

    ALCATEL-LUCENT CONFIDENTIAL: The information contained in this document is the property of Alcatel-Lucent. Except as expressly authorized in writing by Alcatel-Lucent, the holder shall keep all information contained herein confidential, shall disclose the information only to its employees with a need to know, and shall protect the information from disclosure and dissemination to third parties. Except as expressly authorized in writing by Alcatel-Lucent, the holder is granted no rights to use the information contained herein. If you have received this document in error, please notify the sender and destroy it immediately.

  • Alcatel-Lucent Publication history i

    UMT/OAM/APP/024291 2009-2013 Alcatel-Lucent

    PUBLICATION HISTORY

    June 2014, Issue 05.00 / EN, Draft

    - Engineering information as part of the LR14.xW main features:

        o 173536 - WMS basic support of 9771 RNC
        o 174225 - WMS support of 9771 RNC tenant management
        o 173330 - PM support of LR14.W
        o 168642 - iBTS Resource Tracking using PM Counters
        o 173331 - WMS LR14.W Call trace support
        o 175776 - LR14W support of Solaris 10 update 11
        o 173334 - WMS 3rd party update
        o 173329 - LR14 - WMS LR14.W KPI dimensioning

    - Introduction of WCE Platform Manager

    - Update of SAM release supported

    - Removal of extension hardware for M5000 and addition to the nominal hardware configuration

    - Update of the supported Symantec NetBackup version

    June 2014, Issue 05.01 / EN, Preliminary

    - Post Review

    TABLE OF CONTENTS

    1. ABOUT THIS DOCUMENT
    1.1. AUDIENCE FOR THIS DOCUMENT
    1.2. NOMENCLATURE
    1.3. SCOPE
    1.4. REFERENCES

    2. OVERVIEW
    2.1. NETWORK MANAGEMENT FUNCTIONALITY

    3. WMS MAIN SERVER ENGINEERING CONSIDERATIONS
    3.1. NSP OVERVIEW
    3.2. FAULT MANAGEMENT APPLICATIONS
    3.3. FAULT AND CONFIGURATION MANAGEMENT
    3.4. SRS FUNCTIONALITY
    3.5. PERFORMANCE MANAGEMENT APPLICATION
    3.6. CAPACITY
    3.7. BACKUP AND RESTORE
    3.8. INTEGRATION OF WMS TO A STORAGE AREA NETWORK (SAN)
    3.9. OAM SERVER RESOURCE MONITORING

    4. WMS EXTERNAL INTERFACE ENGINEERING CONSIDERATIONS
    4.1. OVERVIEW
    4.2. THE ALCATEL-LUCENT SECURITY BUILDING BLOCK
    4.3. THE 3GPP NOTIFICATION BUILDING BLOCK
    4.4. 3GPP FAULT MANAGEMENT BUILDING BLOCK (3GPP FM BB)
    4.5. 3GPP BASIC CM BUILDING BLOCK (3GPP BASICCM BB)
    4.6. 3GPP BULKCM BUILDING BLOCK (3GPP BULK CM BB)
    4.7. 3GPP PM BUILDING BLOCK (3GPP PM BB)
    4.8. 3GPP BUILDING BLOCK DEPLOYMENT
    4.9. 3GPP EXTERNAL INTERFACE CAPACITY AND PERFORMANCE
    4.10. 3GPP PM BB EXTERNAL INTERFACE
    4.11. CHANGES IN 3GPP MOI (DN) ALIGNMENT
    4.12. OSS REMOTE LAUNCH OF WMS GUI
    4.13. WMS EAST-WEST INTERFACE

    5. WMS CLIENTS AND SERVER OF CLIENTS ENGINEERING CONSIDERATIONS
    5.1. WMS CLIENT CAPACITY
    5.2. WMS CLIENT USER PROFILE
    5.3. CLIENT ENGINEERING CONSIDERATIONS
    5.4. WMS SERVER OF CLIENTS ENGINEERING CONSIDERATIONS

    6. SOLUTION HARDWARE SPECIFICATIONS
    6.1. OVERVIEW
    6.2. SERVER HARDWARE SPECIFICATIONS
    6.3. CLIENTS HARDWARE SPECIFICATIONS
    6.4. GENERAL AVAILABILITY OF ORACLE/SUN SERVERS
    6.5. DCN HARDWARE SPECIFICATIONS
    6.6. OTHER EQUIPMENT

    7. NETWORK ARCHITECTURE
    7.1. DEFINITION - NOC / ROC ARCHITECTURE
    7.2. REFERENCE ARCHITECTURE
    7.3. FIREWALL IMPLEMENTATION
    7.4. NETWORK INTERFACES ON WMS SERVER
    7.5. WMS SERVER IP ADDRESS REQUIREMENTS
    7.6. NETWORK INTERFACES ON CLIENTS
    7.7. NETWORK INTERFACES FOR REMOTE ACCESS
    7.8. OTHER NETWORKING CONSIDERATIONS

    8. BANDWIDTH REQUIREMENTS
    8.1. BANDWIDTH CONSIDERATIONS WITHIN THE ROC
    8.2. BANDWIDTH REQUIREMENTS BETWEEN THE ROC AND THE NES
    8.3. BANDWIDTH REQUIREMENTS BETWEEN THE ROC AND THE CLIENTS
    8.4. BANDWIDTH REQUIREMENTS BETWEEN THE ROC AND EXTERNAL OSS

    9. SECURITY AND REMOTE ACCESS
    9.1. AUTHORIZATION, AUTHENTICATION AND SESSION MANAGEMENT
    9.2. SYSTEM SECURITY
    9.3. SECURITY TRANSMISSION AND RADIUS

    10. NETWORK TIME SYNCHRONISATION
    10.1. ABOUT NTP FUNCTIONALITY
    10.2. COMPATIBILITIES
    10.3. TIME SOURCE SELECTIONS
    10.4. REDUNDANCY AND RESILIENCY
    10.5. DEFAULT BEHAVIOUR OF WMS MAIN SERVER UNDER OUTAGE CONDITIONS
    10.6. RECOMMENDED NTP ARCHITECTURE
    10.7. USING PUBLIC TIME SOURCES OVER THE INTERNET
    10.8. NTP ACCURACY AND NETWORK DESIGN REQUIREMENTS
    10.9. NTP RESOURCE USAGE CONSIDERATIONS

    11. WCE PLATFORM MANAGER
    11.1. OVERVIEW
    11.2. HARDWARE
    11.3. NETWORK INTERFACES AND IP ADDRESSING

    12. 5620 SAM
    12.1. OVERVIEW
    12.2. HARDWARE
    12.3. SOFTWARE
    12.4. FAULT MANAGEMENT OF 77XX-SR ON WMS
    12.5. OSS NORTHBOUND INTERFACE

    13. ANNEXES
    13.1. OBSERVATION FILES
    13.2. NE SOFTWARE

    14. ABBREVIATIONS

    LIST OF FIGURES

    Figure 1: NSP Overview
    Figure 2: Fault Management Architecture
    Figure 3: Configuration Management Architecture
    Figure 4: Performance Management Architecture
    Figure 5: Dual Main Server configuration
    Figure 6: Light Sysmon architecture
    Figure 7: Overall architecture and communication channels
    Figure 8: Overall Deployment and Communication Channels
    Figure 9: Firewall change to implement RTCT
    Figure 10: Storage Area Network Architecture
    Figure 11: 3GPP FM High Level architecture
    Figure 12: Basic CM/Kernel CM High Level architecture
    Figure 13: Bulk CM High Level architecture
    Figure 14: PM High Level architecture
    Figure 15: 3GPP Output Building Block Deployment within a ROC
    Figure 16: WMS East-West Interface
    Figure 17: RAMSES Solution Architectural Diagram
    Figure 18: Reference Architecture
    Figure 19: M5000 with System controller and ST2540 connectivity
    Figure 20: Example of NETRA T5440 with System controller connectivity
    Figure 21: Terminal Server Connections
    Figure 22: WMS security panel
    Figure 23: WMS Security system AAA Architecture
    Figure 24: SSH usage with WMS
    Figure 25: Recommended Time Synchronization Architecture
    Figure 26: WCE System Architecture and Management
    Figure 27: 77x0-SR Fault Management on WMS via 5620 SAM

    LIST OF TABLES

    Table 1: WMS Nominal Main Server Capacity
    Table 2: WMS Legacy Hardware Main Server Capacity
    Table 3: ROC mixed server configuration
    Table 4: WMS failure scenarios and consequences
    Table 5: Maximum recommended threshold alarms per server type
    Table 6: Fault Rate of WMS Main Server
    Table 7: Simultaneous software downloads to Access NE (Nominal machines)
    Table 8: Simultaneous software downloads to Access NE (legacy machines)
    Table 9: Typical software size per Access NE
    Table 10: Supported GPM data granularities
    Table 11: Maximum Counter Instances supported per RNC Platform type
    Table 12: Maximum iBTS supported with resource tracking enabled per WMS hardware type
    Table 13: Call Trace Type Definitions
    Table 14: Engineering Guidelines for simultaneous CTn Sessions
    Table 15: Example Data Generation for Large/X-Large WMS server for simultaneous CTn Sessions
    Table 16: Maximum number of standing alarms per hardware type
    Table 17: Tape drive and Domain/Server matrix compatibility
    Table 18: WMS server data resources
    Table 19: 3GPP FMBB Specifications
    Table 20: 3GPP CM BB Specifications
    Table 21: Number of concurrent clients per Main Server type
    Table 22: Number of Registered users per ROC
    Table 23: SUN SPARC ENTERPRISE T4-1 Hardware Requirements
    Table 24: ORACLE/SUN ENTERPRISE M5000 Hardware Requirements
    Table 25: Windows PC Hardware Requirements for WMS
    Table 26: Terminal Server Console Specifications
    Table 27: Interface Configuration - Configuration A
    Table 28: Interface Configuration - Configuration B
    Table 29: Interface Configuration - Configuration C
    Table 30: Interface Configuration - Configuration D
    Table 31: Interface Configuration - Configuration E
    Table 32: Supported Interface Configurations per server type
    Table 33: WMS IP Requirements Summary
    Table 34: Protocols used on southbound Interfaces
    Table 35: SysLog message characteristics
    Table 36: WMS Main Server to OSS Bandwidth Requirements
    Table 37: NEs supporting RADIUS/IPSec
    Table 38: HP DL380p G8 Hardware Requirements

    1. ABOUT THIS DOCUMENT

    This document details the engineering rules for the WMS Main Server, OAM Server hardware/software requirements, OAM DCN recommendations, backup and restore, remote access and other OAM engineering information for WMS. It also provides engineering information for other network management elements such as 5620 SAM and WCE Platform Manager. Please refer to the Scope section for the list of supported elements.

    1.1. AUDIENCE FOR THIS DOCUMENT

    This WMS Engineering Guide has been specifically prepared for the following audience:

    - Network Engineers

    - Installation Engineers

    - Network & System Administrators

    - Network Architects

    1.2. NOMENCLATURE

    < >

    : The OAM rules (non-negotiable) are typically OAM capacity values and IP addressing parameters (subnet, range, etc.).

    : A system restriction can be a feature that is not applicable to an OAM hardware model.

    : Mainly recommendations related to performance (QoS, capacity, KPI) to get the best out of the network.

    : Can be an optional suggestion, or a configuration note that can be operator dependent.

    1.3. SCOPE

    This Engineering Guide is for Alcatel-Lucent WMS for release LR14.xW. Rules or features that apply only to a specific release are clearly identified through a nomenclature section.

    In scope of this Engineering Guide:

    - Alcatel-Lucent 9353 WMS (W-CDMA Management System, previously WMS)

    - Alcatel-Lucent 9771 RNC (WCE) Platform Manager

    - Alcatel-Lucent 9952 WPS (Wireless Provisioning Solution) Multi-Standard

    - Alcatel-Lucent 5620 Service Aware Manager (SAM)

    Not in the scope of the WMS Engineering Guide:

    - Other OAM platforms not part of WMS

    Throughout the document, these different management components are referred to by their main names, such as WMS, WPS, etc.

    Engineering Note: Release Notes reference

    Please note that it is essential to go through the Release Notes of the release and associated patches being installed to understand if there are individual limitations/restrictions in the product that may differ from this document.

    1.4. REFERENCES

    All references about Alcatel-Lucent's WMS can be found in the following Alcatel-Lucent Technical Publications.

    9YZ-05870-0001-ACZZA - Alcatel-Lucent W-CDMA System Document Collection Overview

    For further information on how to obtain these documents, please contact your local Alcatel-Lucent representative.

    2. OVERVIEW

    Alcatel-Lucent Wireless Management System (WMS) delivers an integrated OAM management platform across the radio access, IP/ATM backbone and service enabling platform domains. WMS plays an important role in providing the foundation Network Management capabilities to the Alcatel-Lucent solution for complete, end-to-end management of the Wireless Network.

    2.1. NETWORK MANAGEMENT FUNCTIONALITY

    WMS is focused on efficiently delivering the foundation on which to deploy and maintain the Wireless Internet network resources, deliver services, and account for network and service use by subscribers. The key functions of the network management layer are described below.

    2.1.1 NETWORK MANAGEMENT PLATFORM

    Network Services Platform (NSP) is the underlying platform or operating environment that enables management of the network resources and of the services being delivered to customers. The platform provides a single, integrated view of the entire Alcatel-Lucent wireless network across radio access and the service enabling platforms as well as a launch pad for all pre-integrated internal and/or underlying systems and tools.

    2.1.1.1 FAULT MANAGEMENT

    NSP Fault Management tools provide an integrated set of fault surveillance, diagnosis and resolution tools that span the radio access domain as well as the service enabling platforms and the IP/ATM backbone, giving the operator a single alarm view across the entire network. These tools enable the operator to identify and resolve network-affecting or service-affecting issues quickly and efficiently.

    WMS Fault Management functionality for wireless networks includes Alarm Management (real-time alarm surveillance, delivered as an integral part of the NSP) and, optionally, Historical Fault Management (Historical Fault Browser).

    WMS Fault Management functionality also includes alarm filtering, specifically alarm delay and alarm inhibit capabilities on the alarm stream. In addition, operators can modify the alarm severity attribute of alarms in the alarm stream to optimize their alarm handling.
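As a rough illustration of the alarm delay, alarm inhibit and severity modification capabilities described above, the following Python sketch applies all three to a list of alarm events. The class names, the fields and the exact semantics (for example, that a delayed alarm is dropped when it clears before the delay expires) are assumptions made for this example and are not taken from the WMS product.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AlarmEvent:
    """Hypothetical alarm record; the field names are illustrative only."""
    alarm_id: str
    ne_id: str
    severity: str
    raised_at: float                    # seconds since an arbitrary origin
    cleared_at: Optional[float] = None  # None while the alarm is still active


@dataclass
class FilterPolicy:
    """Hypothetical filter settings covering the three capabilities."""
    inhibited_nes: set     # alarm inhibit: NEs whose alarms are suppressed
    delay_s: float         # alarm delay: minimum age before forwarding
    severity_map: dict     # severity modification: old severity -> new severity


def apply_policy(events, policy, now):
    """Return the alarms an operator would see once the policy is applied."""
    forwarded = []
    for ev in events:
        if ev.ne_id in policy.inhibited_nes:
            continue  # inhibit: never shown
        end = ev.cleared_at if ev.cleared_at is not None else now
        if end - ev.raised_at < policy.delay_s:
            continue  # delay: too young, or cleared before the delay expired
        new_severity = policy.severity_map.get(ev.severity, ev.severity)
        forwarded.append(replace(ev, severity=new_severity))
    return forwarded


events = [
    AlarmEvent("a1", "NodeB-1", "minor", raised_at=0.0, cleared_at=5.0),
    AlarmEvent("a2", "NodeB-2", "major", raised_at=0.0),
    AlarmEvent("a3", "RNC-1", "minor", raised_at=0.0),
]
policy = FilterPolicy(inhibited_nes={"NodeB-2"}, delay_s=30.0,
                      severity_map={"minor": "major"})
result = apply_policy(events, policy, now=60.0)
print([(ev.alarm_id, ev.severity) for ev in result])
```

In this example only the third alarm is forwarded: the first cleared within the delay window, the second belongs to an inhibited NE, and the third has its severity raised by the mapping.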

    2.1.1.2 PERFORMANCE MANAGEMENT

    WMS Performance Management functionality for wireless networks includes, as a base, near real-time Performance Monitoring together with collection, mediation and conversion to 3GPP-compliant XML format for use with any 3rd-party Performance Management tools. The Performance Server functionality co-resides on the WMS Main Server. The WMS performance management tools are designed for viewing and optimizing network element and service performance across Alcatel-Lucent radio access (UMTS). Performance Management helps service providers pinpoint and resolve potential network performance issues before they become a problem for their end customers.

    For historical Performance Reporting, 9959 NPO (Network Performance Optimizer) is proposed as an option. NPO also optionally provides the WCT (Wireless Call Trace) functionality to optimize neighboring cells, based on neighboring-cell Call Traces.

    2.1.1.3 CONFIGURATION MANAGEMENT

    An integrated set of capabilities designed to configure the parameters of all network elements within the wireless network is provided as part of WMS. Configuration Management has two aspects. Off-line configuration tools are designed to make the most time-consuming configuration activities efficient and effective through pre-integrated assistants for standard configuration activities. On-line configuration is performed via an integrated set of network element-focused configuration tools, accessible directly from the management platform via a context-sensitive launch capability, so that network element configuration can be done quickly, easily and with minimal risk of errors. WMS Configuration Management functionality includes off-line and on-line configuration for the radio access network (UMTS), combined with on-line configuration reach-through across the entire network.

    2.1.1.4 INTERFACE TO UPSTREAM MANAGER

    WMS offers 3GPP OAM standards-compliant interfaces that allow a customer's OSS to manage the Alcatel-Lucent wireless networks. The 3GPP-compliant Itf-N interfaces are based on the 3GPP standards, and the solutions offered include support for the Alarm IRP, the BasicCM IRP and the BulkCM IRP (UMTS Access), as well as support of XML transfer of 3G performance counters. The Alarm IRP allows fault OSSs to receive, through a 3GPP-compliant interface, alarm information from the Alcatel-Lucent wireless networks.

    The BasicCM IRP allows the OSS to discover network elements as well as attributes of the network elements. The BulkCM IRP allows the OSSs to bulk provision standards-based attributes of the UTRAN networks.

    The support of the XML interface for performance allows performance OSSs to gather performance statistics from the Alcatel-Lucent wireless networks using standards compliant mechanisms.

    2.1.2 HARDWARE PLATFORM

    WMS is delivered on a simple, scalable hardware platform strategy designed to grow effectively with the rollout of Wireless services. The WMS Main Server is dedicated to providing Fault, Configuration, Performance, Security, User and Network Element Access Management among other functionalities.

    The client workstations supported for management of the Wireless network include Sun Workstations and PCs. They host the WMS clients along with a standalone configuration tool called WPS.

    Additionally, there are dedicated hardware platforms for optional applications such as NPO, WCE Platform Manager, 5620 SAM, etc.

    3. WMS MAIN SERVER ENGINEERING CONSIDERATIONS

    This chapter gives an architectural overview and describes the general engineering rules for the WMS Main Server.

    The Main Server is the heart of the Network Management platform for managing the Alcatel-Lucent Radio Access Network.

    The Main Server functionality provides Performance, Fault, Configuration Management of the UTRAN network, User Access and System Management, Software Repository of the wireless network and 3GPP compliant Itf-N.

    The different components of the Main server are described in the following sections of the chapter.

    3.1. NSP OVERVIEW

    NSP (Network Services Platform) is an integrated telecommunications network management software platform developed by Alcatel-Lucent that provides a single point of control for the operation, administration, maintenance, and provisioning functions for telecommunications networks in a multi-domain network management environment. NSP uses a scalable client-server infrastructure supported directly by distributed CORBA application building blocks and CORBA device adapters.

    NSP architecture is described as follows:

    - Device adapters collect real-time data from the network (either from the network elements themselves or from the element management systems such as Access Module) and translate network element data into a format that the NSP applications can process.

    - The collected data from the various device adapters is passed to distributed CORBA applications (building blocks).

    - These building blocks (SUMBB, FMBB, TUMSBB) process the data and where necessary summarize it.

    - This processed data is then provided to client applications. Java-based multi-platform enabled GUI clients display the processed data.
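The data flow listed above (device adapters, then building blocks, then GUI clients) can be pictured with a minimal in-process Python sketch. It is only an analogy: the real NSP components are distributed CORBA applications, and every name below (the `Alarm` record, `ascii_adapter`, `SummaryBlock`, `render`, the `ne_id|severity|text` line format) is hypothetical and chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Alarm:
    """Normalized record produced by an adapter (hypothetical fields)."""
    ne_id: str
    severity: str
    text: str


def ascii_adapter(raw_line: str) -> Alarm:
    """Hypothetical device adapter: turns 'ne_id|severity|text' into an Alarm."""
    ne_id, severity, text = raw_line.split("|", 2)
    return Alarm(ne_id.strip(), severity.strip().lower(), text.strip())


class SummaryBlock:
    """Stand-in for a building block that processes and summarizes data."""

    def __init__(self) -> None:
        self.alarms: list[Alarm] = []

    def ingest(self, alarms: Iterable[Alarm]) -> None:
        self.alarms.extend(alarms)

    def alarm_count_per_ne(self) -> dict[str, int]:
        return dict(Counter(alarm.ne_id for alarm in self.alarms))


def render(summary: dict[str, int]) -> list[str]:
    """Stand-in for a GUI client: formats the summarized data for display."""
    return [f"{ne}: {count} alarm(s)" for ne, count in sorted(summary.items())]


raw_feed = [
    "RNC-1|Major|Link down",
    "NodeB-7|minor|Fan speed",
    "RNC-1|critical|Card failure",
]
block = SummaryBlock()
block.ingest(ascii_adapter(line) for line in raw_feed)
lines = render(block.alarm_count_per_ne())
print("\n".join(lines))
```

The point of the layering is that only the adapter knows the device-specific format; the summarizing block and the display code work on the normalized records.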

    Figure 1 : NSP Overview

    3.1.1 NSP COMPONENT OVERVIEW

    This section gives a high-level definition of the different components of NSP.

    3.1.1.1 NSP GRAPHICAL USER INTERFACE

    The NSP GUI is a Java-based GUI with point-and-click navigation. It provides integrated real-time fault management capabilities, the ability to view OSI node state information for data devices (where supported) and a context-sensitive reach-through to underlying EMS and devices. The NSP GUI also provides application launch, customer-configurable custom commands, nodal discovery of devices, technology layer filtering (e.g. wireless, switching, IP transport), access controls, network partitioning, and multiple independent views.

    NSP can launch other applications (e.g. element provisioning) directly using Application Launch scripts, delivering a single point of access to multiple applications. NSP enables easy, in-context reach-through to underlying Network Element interfaces or Element Management Systems (EMS) via the drop-down menu accessible from each NE's icon.

    [Figure 1 labels: Network Elements; Element Management System; DA (device adapters); FMBB; TUMSBB; SUMBB; User GUI Client with Fault GUI Plugins and Other GUI Plugins. Interfaces and data flows: ASCII, SNMP, CORBA, CMIP; NE Info; Alarm Info; Alarm Count; Detailed NE Info / Topology Info; Detailed Alarm Info / Alarm Ack and Clear.]

    3.1.1.2 FAULT MANAGEMENT BUILDING BLOCK (FMBB)

    FMBB acts as the common point of contact to provide integrated alarm information for the entire network. FMBB provides the following fault management interfaces:

    - Alarm Log Monitor interface, which allows clients to retrieve a current snapshot of alarms within the system

    - Alarm Manager interface, which allows clients to monitor alarms on an ongoing basis

    - Control interface, which allows clients to acknowledge alarms and manually clear alarms

    FMBB communicates with application clients via the Object Request Broker (ORB) to service requests for alarm and event information. FMBB also communicates with Device Adapters (DA) via the ORB to retrieve the data requested by the application clients.

    FMBB is solely concerned with the current alarms/events for the network, that is, the alarm conditions as they occur or only those alarm conditions which are still active on the network elements. Other service assurance applications, such as the Historical Fault Browser (HFB), address the requirement for alarm history. There is 1 FMBB per WMS Main Server.

    For scalability, multiple instances of FMBB can be deployed. Typically, a network could be subdivided into sub-domains. In such a deployment, one instance of FMBB would be responsible for the alarm information from a single sub-domain.

    3.1.1.3 TOPOLOGY UNIFIED MODELLING SERVICE BUILDING BLOCK (TUMSBB)

    The WMS Topology Unified Modelling Service (TUMS) is used for NE Discovery and Network Layer Management.

    Network Element Discovery is done using an interface between TUMS and the DA. When the DA discovers new NEs, it reports them to TUMS. TUMS registers this information with the NSP GUI (via SUMBB), and the NEs are then available to be added to a Network Layout.

    3.1.1.4 SUMMARY SERVER BUILDING BLOCK (SUMBB)

    The Summary Server (SUMBB) is involved with summarizing fault information passed to it via FMBB, and NE information passed to it via TUMSBB. The NSP GUI then uses this information to report to the user.

    As well as summarizing alarm information, SUMBB is used to store and process all of the information that identifies layouts, groups and NEs within NSP. This provides the means to partition NEs into groups and layouts for different sets of users.

    3.2. FAULT MANAGEMENT APPLICATIONS

    The following applications provide the operator with additional service assurance features to manage their network:

    3.2.1 HISTORICAL FAULT BROWSER (HFB)

    Historical Fault Browser (HFB) provides a generic event history capability across WMS managed network elements. It has a flexible query mechanism allowing users to aggregate selective alarm history information. Specifically, HFB captures all alarm data for historical analysis, incident reporting, and customer impact analysis. A Web-based graphical user interface (GUI) provides easy accessibility to fault information for troubleshooting in Operations Centres or remote locations. The HFB allows the user to perform the following tasks:

    - Filter an alarm list on any displayed field from the database

    - Display multiple queries at the same time, each in a separate window

    - Sort alarms by any column in the table

    - Display retrieved alarm data in hypertext mark-up language (HTML) report format

    - Store queried data to file

    - Print selective alarm event data from the database

    HFB retrieves raise alarm and clear alarm events from the network via the WMS (Building Block) architecture. Historical Fault Browser automatically supports newly added network elements without any additional configuration required. Alarm events are stored in an Oracle Relational Database Management System (RDBMS).

    HFB Query Interface

    The HFB query interface generates advanced tabular and graphical reports from the HFB and stores them in a plain text file in comma-separated values (CSV) format. Users can download the file from the primary Main Server to create specific reports using standard tools such as Excel.

    To build a report, users issue one WICL command with the appropriate parameters. At a minimum, the command must contain the SQL statement used to query the result record set and the location of the file to be returned for the user to download. The user's WICL command is reassembled into one or more Oracle PL/SQL statements, which are passed through the WICL engine to a Shell/Tcl script. The script then launches SQL*Plus and executes the SQL statements within a predefined procedure. Finally, the procedure saves the query result to the CSV-formatted data file designated by the argument in the WICL command.
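
    The last two steps (execute a SQL statement, save the result set as CSV) can be sketched as follows. This is an illustration only and not the HFB implementation: it uses Python with an in-memory SQLite database as a stand-in for Oracle, and the table name alarm_history, its columns and the function run_query_to_csv are hypothetical. No WICL syntax is shown.

```python
import csv
import io
import sqlite3


def run_query_to_csv(conn, sql, out):
    """Execute one SQL statement and write a header row plus result rows as CSV."""
    cur = conn.execute(sql)
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])  # column names
    rows = cur.fetchall()
    writer.writerows(rows)
    return len(rows)


# Hypothetical alarm history table standing in for the HFB database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alarm_history (ne TEXT, severity TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO alarm_history VALUES (?, ?, ?)",
    [("RNC1", "critical", "raise"), ("RNC1", "critical", "clear"), ("NodeB7", "minor", "raise")],
)

buf = io.StringIO()  # a real report would be written to a file instead
count = run_query_to_csv(
    conn,
    "SELECT ne, severity, event FROM alarm_history WHERE ne = 'RNC1' ORDER BY event",
    buf,
)
report = buf.getvalue()
print(report)
```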

    3.3. FAULT AND CONFIGURATION MANAGEMENT

    This section gives a high-level architecture overview of the fault and configuration management within WMS.

    Figure 2: Fault Management Architecture

    Figure 3: Configuration Management Architecture

    The following sections describe the tools used within the components:

    3.3.1 ACCESS-DA

    The Access Object Model Manager portion of the Access Modules sends fault information to the Access DA. The Access DA ensures the OAM facilities of the Access network by providing the following functionalities for fault management:

    - Receive and store the notifications from the NEs in the access network

    - Convert the notifications into alarms

    - Transmit the alarms to the GUI

    The Access DA also receives fault information from the RNC I-Node via INT_PP_COMM APIs.

    3.3.2 ACCESS MODULE

    The Access Object Model Manager and the Access Device Adapter are parts of the UMTS Access Module. The Access Object Model Manager is the element management system for the NODE B and the RNC C-Node.

    The UMTS Access Object Model Manager portion of the UMTS Access module directly manages these devices over a proprietary messaging interface called SE/PE (over TCP/IP). It then sends the fault information into the Access DA portion of the Access module. It has a shelf-level display of these elements that it manages, which can be launched in-context from the NSP GUI.

    The configuration of the Access devices is controlled and coordinated for the most part by the Access Module, except for the RNC IN/POC, which is configured through INT_PP_COMM. Note that the configuration of the RNC IN/POC through INT_PP_COMM is transparent to the user.

    The whole RNC configuration is performed through the access module: CM XML Files for C-Node, A-Node and I-Node are constructed in Alcatel-Lucent Wireless Provisioning System, then imported in the Access Module that uses the INT_PP_COMM API to transmit CAS commands toward the Multiservice Switch.

    3.3.3 INT_PP_COMM

    From OAM9.1, the INT_PP_COMM module replaces the Multiservice Data Manager (MDM) managing Multiservice Switch devices. A new communication layer with Passport equipment is built based on the FMIP (Fast Management Information Protocol) specification. The FMIP protocol is a point-to-point, connection-oriented protocol similar to the OSI CMIS services with some extensions to meet provisioning requirements.

    From a system functionality point of view, all Multiservice Switch (MSS) CM/FM/PM management scenarios, as well as B&R, user administration, etc., are kept in the system. The MDM GUI and toolset are removed and considered obsolete.

    3.3.4 NETCONF NODE MANAGER

    The NETCONF Node Manager is responsible for managing the 9771 RNC, especially the WCEPlatform and WNode parts.

    Typical tasks for a Node Manager are:

    - Integrate CM operations into CM sessions managed by Configuration Manager

    - Manage operations as internal Jobs (to be stored in Job historical server)

    - Push alarms to WMS Fault Management layer

    The NETCONF Node Manager is responsible for implementing the OAM operations for the 9771 RNC that are available on the NETCONF interface. It is also responsible for all the logic related to the 9771 RNC WCEPlatform and WNode that needs to be implemented at WMS level.

    3.4. SRS FUNCTIONALITY

    The Software Repository Server (SRS) is used to store the software installed on the wireless network. This server contains software in a format ready to be used by all the installation tools. The software is obtained from a web server or DVD.

    The WMS software tar files are available in compressed format from the Alcatel-Lucent web site (e-delivery) or on DVD. The third-party tools supported for compressing these files are gzip (extension .gz), compress (extension .Z) and zip (extension .zip).

    There is only one SRS per ROC located on the WMS Main Server. This SRS can be shared by several ROCs. The SRS functionality on the WMS Main Server covers WMS load patches and Access NE software loads. The SRS contains dedicated software accessible by any web browser. This tool helps the end-user to correctly install the delivery files at the right location on the SRS.

    3.5. PERFORMANCE MANAGEMENT APPLICATION

    The Performance Management application enables the collection of measurements from the network elements and the export of these measurements in XML format. The measurements can be statistical information (observation) or call trace.

    The following figure represents the architecture of the Performance Management Application.

    Figure 4: Performance Management Architecture

    The different components are explained as follows:

    3.5.1 DATA COLLECTION

    The collection of performance measurements is based on the following components:

    3.5.1.1 ACCESS PERFORMANCE MANAGEMENT (APM)

    The Access Performance Manager (APM) is an Access collector. It collects the raw data from the access network elements (RNC and NODE B) using FTP.

    3.5.1.2 INT_PP_COMM

    From OAM9.1, the INT_PP_COMM module replaces the Management Data Provider (MDP) for data collection from Multiservice Switch devices (i.e. devices without any wireless-specific software). The CC files collected from the Multiservice Switch based devices are converted into BDF files, which are then further processed by the PDI.

    3.5.2 DATA MEDIATION

    The performance data collected by APM is converted into the 3GPP compliant XML file format by the following interfaces and processes:

    3.5.2.1 ACCESS DATA INTERFACE (ADI)

    The ADI is the interface for Access Performance and Configuration data to the performance reporting application. ADI mediates counters and call trace data from the devices' native format into XML files. ADI converts the raw data to the XML file format in the 3GPP XML interface directory and aggregates the supported performance data into hourly XML files, which are also placed in the 3GPP XML interface directory.

    3.5.2.2 PACKET DATA INTERFACE (PDI)

    PDI converts the files from Multiservice Switch based devices to an XML format. PDI does not perform time based counter aggregation. However, PDI supports the merging of the multiple files which a Multiservice Switch shelf can generate within a single 15 minute period.

    3.5.2.3 XML COMPRESSION (GZIP)

    All XML data interfaces support XML compression. When compression is enabled, files have an added gzip extension. Compression is recommended to increase storage time on the WMS Server and to lower bandwidth requirements for transfers to an external OSS. The external OSS must be compatible with this format or have a mechanism to decompress the XML files.
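
    As an illustration of what an external OSS needs to handle, the following sketch compresses and decompresses an XML payload with the Python standard gzip module. The payload is a made-up, repetitive XML fragment and not a real WMS observation file; for files on disk, gzip.open can be used in the same way.

```python
import gzip

# Made-up, repetitive XML payload (not a real WMS observation file).
payload = b"<counters>" + b"<c name='x'>0</c>" * 500 + b"</counters>"

compressed = gzip.compress(payload)    # what would be stored with a .gz extension
restored = gzip.decompress(compressed)  # what the OSS must do before parsing

print(len(payload), len(compressed))
```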

    3.6. CAPACITY

    Engineering Rule: Nominal Server capacity

    The Main Server capacities over different nominal platforms (orderable) are defined in the table below.

    Hardware Platform (Nominal)                           RNC / WCE   NODE B   Max 3G cells
    M5000 (8 CPU) + 1*ST2540 & expansion tray ST2501      50          4000     24000
    SE T4-1 (1 CPU)                                       10          1000     6000

    Table 1: WMS Nominal Main Server Capacity

    Engineering Rule: Legacy Server capacity

    The legacy Main Server capacities over different platforms (not orderable) are still supported and defined in Table 2 below.

    Hardware Platform (Legacy)                            RNC / WCE   NODE B   Max 3G cells
    SF4900 (12 CPU) (1)                                   50          3000     18000
    M5000 (4 CPU) + 1*ST2540 & expansion tray ST2501      20          2000     12000
    SF4900 (8 CPU) (1)                                    20          2000     12000
    NETRA T5440 (2 CPU)                                   10          1000     6000
    SF v890 (4 CPU)                                       7           700      4200
    T5220 (1 CPU)                                         3           200      1200

    Table 2: WMS Legacy Hardware Main Server Capacity

    Engineering Note: N240 End of Life

    N240 server is no longer supported from OAM 9.1 and has to be replaced by a server of similar or higher capacity (for example T5220).

    (1) SF E4900 with SE6120 is no longer supported from OAM08. Note that SF E4900 with ST6140 is still supported.

    Engineering Note: WCE Capacity

    Please note the following:

    - WMS manages only the RNC tenant of the 9771 RNC (WCE), i.e. the WCEPlatform and WNode.

    - One WCE can only be managed by one WMS.

    - There is no restriction on the number of WCEs managed by WMS, as long as the cells of the managed WCEs do not exceed the total number of cells supported by WMS in the tables above.
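
    A planned deployment can be checked against the capacity tables with a few lines of code. The sketch below is only an illustration: the limits are copied from Table 1 and Table 2 for three of the platforms, while the dictionary layout and the function name within_capacity are hypothetical.

```python
# (max RNC / WCE, max NODE B, max 3G cells), copied from Table 1 and Table 2.
CAPACITY = {
    "M5000 (8 CPU)": (50, 4000, 24000),
    "SE T4-1 (1 CPU)": (10, 1000, 6000),
    "T5220 (1 CPU)": (3, 200, 1200),
}


def within_capacity(platform, rnc, node_b, cells):
    """Return True if the planned network fits within all three platform limits."""
    max_rnc, max_node_b, max_cells = CAPACITY[platform]
    return rnc <= max_rnc and node_b <= max_node_b and cells <= max_cells


print(within_capacity("SE T4-1 (1 CPU)", rnc=8, node_b=900, cells=5400))
print(within_capacity("SE T4-1 (1 CPU)", rnc=8, node_b=900, cells=6500))
```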

    3.6.1 DUAL MAIN SERVER CONFIGURATION

    This section provides an overview of the functionality of a dual main server and the corresponding limitations of such a design.

    The configuration consists of two Main servers (Primary and Secondary) with only one instance of the Summary Server Building Block (SUMBB) and one instance of the Historical Fault Browser database (HFB) residing on the Primary Main server. All WMS clients will communicate and get information from the SUMBB and the HFB on the primary main server. When the NSP client requires more details than what is available on the Primary main, it communicates with the TUMSBB and the FMBB of each server.

    The purpose of deploying both a Primary and Secondary Main server is to allow for scalability by the management of a greater number of Network Elements than could be supported on a single Main server.

    Clients always connect to the Primary Main Server. There is no LDAP server on the Secondary. It is transparent to the users which server manages each NE. Users will be able to see the alarms and the configuration for all the NEs. Launching of tools will be performed from the associated main server. For UMTS Access NEs, the associated Main Server is indicated in the UTRAN CM XML files.

    The Dual Main Server configuration does not provide any redundancy beyond what is available with a single Main Server, nor does it provide load balancing. It does provide the appearance at the user level of one large server, in that information from the NEs is fed through a single common instance of the Summary Server Building Block (SUMBB) located on the Primary Main Server. Note that if the Primary Main Server is out of service, the Secondary Main Server will provide limited services. Refer to section 3.6.3 for more information on failure scenarios and their consequences for the system.

    The deployment of a Dual Main Server configuration requires careful advance planning to ensure appropriate NE management and should take into account, among other factors, the regional allocation of the NEs themselves. Ideally, the planning of a Dual Main Server deployment occurs during the CIQ data fill process.

    Figure 5: Dual Main Server configuration

    Also, the Primary Main Server and the Secondary Main Server should be collocated on the same network LAN.

    Each Network Element needs to be integrated on only one server. The NEs should be integrated in such a manner that groups of NEs that are going to be managed together reside on the same server. This allows a higher number of concurrent user sessions.

    Engineering Recommendation: Capacity Load Sharing

    It is also recommended to distribute the NEs across both Main Servers, keeping in mind that the Primary Main Server capacity is reduced when a Secondary Main Server is connected. Capacity numbers should be reduced by 20% on the Primary Main Server when a Secondary Main Server is deployed, and may be increased by 20% on the Secondary (with the exception of the SF 4X00 and SF v8x0 platforms, which are already at their scalability limits).

    Customers that deploy a dual Main server configuration should give preference to the secondary server when deploying NEs and workload across the two Main servers. The number of clients must still be balanced across the servers.
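
    The 20% rule can be applied to any figure from the capacity tables. The following sketch assumes the adjustment applies directly to a nominal NODE B count; the function name and the secondary_at_limit flag (for the platforms that are already at their scalability limits) are hypothetical. Integer arithmetic is used to avoid rounding effects.

```python
def dual_server_capacity(nominal, secondary_at_limit=False):
    """Return (primary, secondary) capacity for a dual Main Server ROC.

    The Primary is reduced by 20%. The Secondary may be increased by 20%,
    unless the platform is already at its scalability limit.
    """
    primary = nominal * 80 // 100
    secondary = nominal if secondary_at_limit else nominal * 120 // 100
    return primary, secondary


# M5000 (8 CPU): 4000 NODE B nominal (Table 1).
print(dual_server_capacity(4000))
# SF4900 (12 CPU): 3000 NODE B (Table 2), no increase on the Secondary.
print(dual_server_capacity(3000, secondary_at_limit=True))
```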

    [Figure 5 labels: Primary Main Server; Secondary Main Server; SUMBB; HFB; FMBB; TUMSBB; Access DA; Data Collection & XML Converter; XML Obs files; 3GPP Interface; Security; WMS Client; OSS; RNC 1 to RNC 4; NodeB 1 to NodeB 8.]

    Engineering Note: Mixed server configurations

    The mixed configuration may be required to address a capacity extension scenario by adding a secondary main server based on the nominal hardware platform.

    It is mainly applicable between the same hardware platforms and it is always mandatory to keep the primary main server model superior to the secondary one to support the module distribution.

    For a ROC footprint with a legacy hardware platform SFE4900 or SFV890, the mixed configuration with a nominal hardware M5000 or NETRA T5440 is supported.

    The following table summarizes the list of mixed configurations supported.

    From OAM9.1, for any newly ordered hardware (i.e. SE T4-1 or M5000), only the same hardware platform is supported as an addition if capacity needs to be added. Therefore, only 2x T4-1 or 2x M5000 servers can be ordered, with no possibility of mixing hardware.

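    [Table 3 is a matrix of Primary Main Server platform against Secondary Main Server platform. Only the two axes are recoverable, not the entries marking the supported combinations.]

    - First axis: SF4900 (12 CPU), SF4900 (8 CPU), SF V890 (4 CPU), M5000 (8 CPU), M5000 (4 CPU), NETRA T5440 (2 CPU), T5220 (1 CPU), T4-1 (1 CPU)

    - Second axis: M5000 (8 CPU), M5000 (4 CPU), NETRA T5440 (2 CPU), T5220 (1 CPU), SF4900 (12 CPU), SF4900 (8 CPU), SF V890 (4 CPU), T4-1 (1 CPU)

    Table 3: ROC mixed server configuration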

    3.6.2 SYSTEM MANAGEMENT

    From OAM9.1, the Sun Management Centre is replaced by the Light Sysmon (or Sysmon) administration tool, which supervises the status of the OAM servers: hardware devices and resources, Operating System services, OAM applications and WMS databases (MySQL, Oracle).

    Light Sysmon tool provides:

    - For the server: configuration overview and status of the CPU, RAM, fan, temperature, Fibre Channel connectivity, etc.

    - For the OS: OS process details, load statistics, swap statistics, CPU and memory usage, file system usage on key partitions/directories, file monitoring.

    - For the applications: application list, status, and provision for manual or automatic restart.

    - For the databases: supervision of Oracle and MySQL.

    The Light Sysmon GUI allows for management of the individual modules stated above as well as real time view of the alarms and resource utilization. The GUI is available over Java Web Start (JWS) or through the Health Plugin which is part of the WMS GUI.

    Light Sysmon server runs on the WMS Primary Main server. A Sysmon Agent is deployed in the WMS Primary Main Server and Secondary (if deployed) Main Server. The purpose of the agent is to collect information and supervise all the hardware and software modules running on the server it is attached to. The relative status and alarms are then pushed to the Light Sysmon Server and displayed. Both server and agent are configured to automatically start at server startup.

    Sysmon can also send alarms to the OSS via SNMP.

    Engineering considerations:

    Light Sysmon has the capacity of storing 20000 alarms and can support 20 active user sessions in parallel. Alarm purge is set to 90 days and is configurable.

    The following figure shows the System administration architecture based on Light Sysmon:

    Figure 6: Light Sysmon architecture

    For complete information on Light Sysmon, please refer to the Alcatel-Lucent 9353 Management System Administration: System Management (9YZ-05870-0018-PCZZA).

    3.6.3 FAILURE SCENARIOS

    In case of dual server configuration, most of the WMS processes are duplicated in both servers to ensure maximum independence, including better load sharing of the management of the Network Elements per server.

    However, the following key processes are hosted by the WMS Primary Main Server only: GUI application, 3GPP IRP (except the 3GPP PM IRP, which is available on both servers), HFB service (Historical Fault Browser), Activation Manager (for CM XML file management), Security framework (e.g. Radius service), System Management (SMG), etc.

    The following table describes the failure scenarios with regards to a given server crash with its associated consequences.

    Each entry below is listed per SERVER IMPACTED, in the form NATURE OF IMPACT: COMMENTS.

    SERVER IMPACTED: Primary Main server down

    - Loss of all WMS Clients: Supervision and operation are no longer available from any of the WMS Clients.

    - Loss of FM supervision: GUIs are not available (SUMBB down).

    - WICL: WICL not available.

    - CM not available: CM operation not available, including the usage of the Activation Manager (WPS).

    - Data congestion (except for NEs managed by the secondary main server): NEs attached to the primary main server cannot send data (notifications, alarms, counters and trace files). The NEs attached to the secondary still push data to the secondary main server.

    - Security: No Radius for NEs supporting Radius (local connection to NE not available).

    - Loss of OSS (except for PM IRP in the secondary main server): OSS cannot connect to the system (Security IRP). No Notification, FM and basic CM with OSS. PM IRP not available in the primary main server (still available in the secondary main server).

    - HFB: Unable to store historical alarms (coming from the NEs managed by the secondary main server).

    - Loss of System Monitoring: Sysmon not available for the whole WMS system.

    SERVER IMPACTED: Secondary Main server down

    - Loss of FM supervision (for NEs attached to the secondary main server only): The NEs attached to the secondary main server cannot send alarms to WMS.

    - CM not available (for the NEs attached to the secondary main server): CM operations against NEs managed by the secondary main server are not available.

    - Data congestion (except for NEs managed by the primary main server): NEs attached to the secondary main server cannot send data (notifications, alarms, counters and trace files). The NEs attached to the primary still push data to the primary main server.

    - Loss of OSS PM IRP in the secondary main server: PM IRP not available in the secondary main server.

    - Loss of process monitoring in the secondary main server: The processes running on the secondary main server are no longer supervised.

    Table 4: WMS failure scenarios and consequences

    3.6.4 ALARM ON THRESHOLD

    Main Server capacity numbers assume that the Alarm On Threshold feature is not enabled. It is expected that Alarm On Threshold can increase server resources usage. Care is required when configuring this feature to ensure that the thresholds that are set will not generate a flood of alarms which would have an adverse impact on the system performance.

    For Access Alarm on Threshold, the flooding protection limits to 300 the number of alarm raises that can be generated by one threshold per Network Element (RNC or NODE B) in a single evaluation period. Evaluation periods line up with the counter granularity (reporting period), so an evaluation period is typically 15 minutes. If the limit of 300 alarms against one threshold is reached, then only one flood alarm is sent.
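
    The flooding protection can be modelled in a few lines. This sketch reflects one reading of the rule, consistent with the worked examples later in this section (up to 299 raises are sent individually, 300 or more are replaced by a single flood alarm); the function name and signature are hypothetical.

```python
FLOOD_LIMIT = 300  # alarm raises per threshold, per NE, per evaluation period


def alarms_emitted(raise_count, limit=FLOOD_LIMIT):
    """Alarms sent for one threshold on one NE in one evaluation period.

    Below the limit every raise is sent; at or above the limit only a
    single flood alarm is sent (assumed reading of the rule).
    """
    return 1 if raise_count >= limit else raise_count


for raises in (10, 299, 300, 5000):
    print(raises, alarms_emitted(raises))
```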

    In defining the thresholds, the following recommendations should be considered on the total number of threshold alarm events (both raise and clear alarms) which can be generated per evaluation period (typically 15 minutes). The goal of these recommendations is to avoid producing so many threshold alarms that they would impact the regular flow of network element alarms (even when the NE alarm rates are high).

    Engineering Rule: Maximum recommended threshold alarms per server type

    The recommended maximum number of threshold alarm events per evaluation period depends on the server type and is defined in Table 5 below.

    Hardware Platform                    Threshold events (raises + clears) per evaluation period
    SE T5220                             400
    SF v890, NETRA T5440, SE T4-1        1000
    SF E4900, SE M5000                   1500

    Table 5: Maximum recommended threshold alarms per server type

    In case of a ROC composed of 2 main servers, it can be assumed that the number in this table can be applied to each server (primary or secondary).

    In assessing the number of alarm events that can be generated by a threshold, the number of instances that each threshold applies to needs to be considered (for example, a single FDDCell threshold can have thousands of instances). It is recommended that the threshold definitions be tested (or simulated) against real counter values before they are implemented on the server, to ensure that they are well defined and do not produce an excess of alarms. Such an evaluation is probably done on a network in normal running condition; this should be taken into account when assessing worst-case conditions and threshold alarm rates, since some network conditions could increase rates beyond what was measured with sample data.

    In implementing thresholds, it is recommended that a progressive approach be taken when setting the threshold values (i.e. setting them initially to a threshold crossing value which generates few alarms, and then adjusting the threshold crossing value incrementally over a longer period of time). Also, for UMTS Access, the hysteresis capability of the threshold feature can be useful, especially when the threshold crossing value is close to the normal average value of the counter or metric against which the threshold is defined.

    Some examples for the evaluation of the impact of the use of this feature are given below and are specific to UMTS Access.

    Case Example 1:

    An operator wants to define 3 different thresholds against some specific FDDCell counters and 5 other thresholds against base RNC counters. The network is composed of 1500 NODE Bs with 3 cells each and 15 RNCs (100 NODE Bs per RNC). The question is whether these thresholds could have an impact on the Main Server.

    Assessment No. 1: worst case with flood alarms

    A very extreme worst case scenario would be for all the threshold instances to raise a maximum number of threshold alarms simultaneously (in one evaluation period). In this particular case, the FDDCell thresholds would reach their limit of 300 alarm events per RNC and would all be replaced by 1 flood alarm.

    The 5 RNC thresholds cannot generate more than 15 alarms each (15 RNCs in total), and the 3 FDDCell thresholds would generate only 1 alarm per RNC each. So in this worst-case analysis, these definitions would generate a burst of 15 RNCs x 8 alarms = 120 alarms.

    Assessment No. 2: worst case without flood alarms

    This assessment shows a worst-case analysis based on scenarios which generate the maximum number of alarms (in one granularity period) without generating a flood alarm. In this case, only the FDDCell thresholds can reach 299 alarm raise instances per threshold in one interval. Since there are 15 RNCs and 3 FDDCell thresholds, such a worst-case scenario would yield 15 (RNCs) x 3 (thresholds) x 299 (alarms) = 13455 alarm raise events. This by far exceeds what is recommended on any server type (the impact of such a burst would be that other alarms from NEs could be delayed by many minutes). This example shows that the best way to do this type of assessment is the technique used in Case Example 2 below, based on probabilities rather than on worst-case scenarios.

    We continue this case assessment assuming that a more detailed study of the alarms generated by these particular threshold definitions has shown that it is practically impossible for the 3 FDDCell thresholds to generate a high number of alarms on more than 1 RNC at any point in time. The maximum number of threshold alarms which could be raised in one period then becomes 1 RNC x 3 thresholds x 299 alarms = 897 alarms, a number which can be managed on servers of type 890 and 4900.
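
    The arithmetic of Case Example 1 can be reproduced as follows. The input figures and the Table 5 limits come from this section; the variable names are illustrative only.

```python
rncs = 15
rnc_thresholds = 5
cell_thresholds = 3
flood_limit = 300

# Assessment No. 1: every threshold fires on every RNC, and each FDDCell
# threshold is replaced by one flood alarm per RNC.
assessment_1 = rncs * (rnc_thresholds + cell_thresholds)

# Assessment No. 2: each FDDCell threshold stays just below the flood limit.
assessment_2 = rncs * cell_thresholds * (flood_limit - 1)

# Refined case: high alarm counts on only one RNC at a time.
refined = 1 * cell_thresholds * (flood_limit - 1)

# Recommended maximum events per evaluation period (Table 5).
table_5 = {"SE T5220": 400, "SF v890": 1000, "SF E4900": 1500}
manageable_on = sorted(p for p, limit in table_5.items() if refined <= limit)

print(assessment_1, assessment_2, refined, manageable_on)
```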

    Case Example 2: (recommended assessment methodology)

    An operator is interested in implementing many thresholds on a series of FDDCell based counters on a SF 4900 based ROC which is managing 6000 cells. The operator sets the threshold crossing values in such a way that under normal conditions only 0.1% of components (cells) have a threshold alarm raised against them. To be safe, the operator assumes that in some extreme conditions, this number can increase 20 times (to 2%). It has been observed that when a threshold alarm is raised, it normally stays raised for 2 intervals. It has also been observed that threshold crossings are statistically independent from one cell to another and more or less uniformly distributed over time (this assumption keeps the example simple).

    In this case, using the assumed worst case value of 2% of 6000 cells, we have at any point in time 120 cells which are in an alarmed state for each threshold. The average hold time of these alarms lasting for 2 periods means that we will have 120 raise alarm events and 120 clear alarm events per 2 periods, so 120 alarm events per period. The maximum recommended value for the number of threshold alarms for a SF 4800 main server is 1000. We could therefore support 8 of these types of thresholds, a number which is below the maximum number of thresholds which can be applied to the FDDCell counter group.
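The probability-based assessment of Case Example 2 can be written as a small calculation. The sketch below uses hypothetical function names and takes the example's figures (6000 cells, 2% alarmed, 2-period hold time, budget of 1000 alarm events per period) as inputs.

```python
import math


def events_per_period(cells: int, alarmed_fraction: float, hold_periods: int) -> float:
    """Average alarm events (raise + clear) per period for one threshold."""
    alarmed_cells = cells * alarmed_fraction
    # Each alarmed cell contributes one raise and one clear over its hold time.
    return 2 * alarmed_cells / hold_periods


def max_thresholds(budget_per_period: int, events_per_threshold: float) -> int:
    """Number of thresholds that fit in the recommended alarm budget."""
    return math.floor(budget_per_period / events_per_threshold)


per_threshold = events_per_period(cells=6000, alarmed_fraction=0.02, hold_periods=2)
supported = max_thresholds(budget_per_period=1000, events_per_threshold=per_threshold)
print(per_threshold, supported)
```

Changing the alarmed fraction or the hold time shows how sensitive the supported number of thresholds is to the operator's assumptions.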

    3.6.5 FAULT RATE OF WMS

    There are three types of rates defined for the alarms on the WMS Main Server:

    - The sustained rate is used to demonstrate that faults move freely at the Main server level even with NEs sending a lot of events. At the sustained rate, there is no queuing in the Main server.

    - The average rate is equal to the sustained rate divided by 10. At the average rate all the operations done on the Main server are fluid.

    - The peak (burst) rate is used to demonstrate that no information is lost and that the Main server does not crash even under very difficult conditions. The peak rate generates queuing, and delays can occur. This represents a situation that can occur on rare occasions on live networks.

    The table below gives the fault rates supported by the WMS Main Server.

    Hardware Type              Sustained rate         Peak rate
                               (fault events/sec)     (fault events/sec)
    M5000 (4 or 8 CPU)         70                     90
    SF E4900 (8 or 12 CPU)     70                     90
    Netra T5440, SE T4-1       50                     60
    SF V890                    50                     60
    SE T5220                   20                     25

    Table 6: Fault Rate of WMS Main Server
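The three rate definitions can be combined with the Table 6 values to classify an observed fault rate. The sketch below is illustrative: the function names and the band labels are assumptions, and only the sustained and peak figures and the "sustained divided by 10" rule come from the text.

```python
# Sustained and peak rates (fault events/sec) copied from Table 6.
FAULT_RATES = {
    "M5000 (4 or 8 CPU)": (70, 90),
    "SF E4900 (8 or 12 CPU)": (70, 90),
    "Netra T5440, SE T4-1": (50, 60),
    "SF V890": (50, 60),
    "SE T5220": (20, 25),
}


def average_rate(sustained: float) -> float:
    """Average rate as defined in the text: sustained rate divided by 10."""
    return sustained / 10


def classify(hardware: str, events_per_sec: float) -> str:
    """Place an observed rate in one of the bands (labels are illustrative)."""
    sustained, peak = FAULT_RATES[hardware]
    if events_per_sec <= average_rate(sustained):
        return "average"      # all Main server operations stay fluid
    if events_per_sec <= sustained:
        return "sustained"    # no queuing expected
    if events_per_sec <= peak:
        return "peak"         # queuing and delay possible, no loss expected
    return "above peak"       # outside the published figures


print(classify("SF V890", 55))
```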

    Engineering Rule: Fault Rate per ROC

    Note that for a ROC containing 2 WMS Main Servers (Primary and Secondary), the sustained rate remains the same as that of the Primary Main Server.

    3.6.6 AUDIT TRAIL

    Although more events are being logged when this option is set to Level 3, this will not have a significant impact on server resources and thus will not impact the capacity specifications.

    3.6.7 SOFTWARE DOWNLOAD

    The SRS functionality on the Main Server allows software to be downloaded via e-delivery. Software for the WCE/RNC/NODE B can be downloaded from the SRS and then downloaded to the respective NEs via the Main Server. Parallel software download to multiple NEs of the same type is supported up to the limits given in the following tables. Users can select more NEs than listed, but the extra NEs are queued so that the number of parallel FTP transfers does not exceed these values.

    Hardware Platform                          Node B     RNC
    SE M5000 (8-CPU)                           250        6
    SE M5000 (4-CPU)                           250        4
    NETRA T5440 (2-CPU), SE T4-1 (1 CPU)       100        4
    SE T5220 (1 CPU)                           50         1

    Table 7: Simultaneous software downloads to Access NE (Nominal machines)

    Hardware Platform        Node B     RNC
    SF E4900 (12-CPU)        250        6
    SF E4900 (8-CPU)         250        4
    SF V890 (4-CPU)          100        2

    Table 8: Simultaneous software downloads to Access NE (legacy machines)

    A typical example of software size per Network Element type is provided in the table below (this is for BTS (iBTS) and RNC 1500 in release UA7.1).

    NE Type      Software size (MB)
    NODE B       60
    RNC          1000

    Table 9: Typical software size per Access NE
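To get a feel for what a large download campaign implies, the parallel limits and typical software sizes can be combined as follows. This is a rough sketch: the function name is hypothetical, and modelling the queue as successive "waves" is a simplification (the real queue may start a new transfer as soon as one finishes).

```python
import math

# Typical software sizes in MB, from Table 9.
SOFTWARE_SIZE_MB = {"NODE B": 60, "RNC": 1000}


def download_plan(ne_type: str, selected: int, parallel_limit: int) -> tuple[int, int, int]:
    """Return (waves, initially queued NEs, total volume in MB)."""
    waves = math.ceil(selected / parallel_limit)
    queued = max(0, selected - parallel_limit)
    total_mb = selected * SOFTWARE_SIZE_MB[ne_type]
    return waves, queued, total_mb


# 600 Node B selected on a platform that allows 250 parallel Node B downloads.
plan = download_plan("NODE B", selected=600, parallel_limit=250)
print(plan)
```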

    For the sshd daemon on the WMS server, it is recommended to set the MaxStartups parameter in the file /etc/ssh/sshd_config to 40:50:100:

    MaxStartups 40:50:100
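For reference, OpenSSH documents the three-field form of MaxStartups as start:rate:full: once "start" unauthenticated connections are pending, new connection attempts are refused with probability rate/100, rising linearly to 100% when "full" is reached. Assuming the sshd on the WMS server follows the same semantics, the effect of 40:50:100 can be sketched as below (the function name is hypothetical).

```python
def drop_probability(unauthenticated: int, start: int = 40,
                     rate: int = 50, full: int = 100) -> float:
    """Probability that sshd refuses a new connection (MaxStartups start:rate:full)."""
    if unauthenticated < start:
        return 0.0
    if unauthenticated >= full:
        return 1.0
    # Linear increase from rate% at 'start' to 100% at 'full'.
    percent = rate + (100 - rate) * (unauthenticated - start) / (full - start)
    return percent / 100


for pending in (39, 40, 70, 100):
    print(pending, drop_probability(pending))
```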

    3.6.8 RNC BACKUP IMAGE

    WMS can back up RNC images from the WICL command line or through the WMS graphical user interface. Several RNC backups can be launched in parallel, with a maximum of 10 RNC backups running at a time (more RNCs can be selected, but they are queued in the system).

    Producing an RNC backup image takes less than 5 minutes per RNC (provided that ALU transmission requirements are followed).

    For business continuity, it is recommended to perform regular backups during periods of low activity (depending on the customer provisioning window), using crontab to trigger the command line mode. The WMS automatic purge mechanism retains one week of backup image history.

    3.6.9 PERFORMANCE MANAGEMENT

    3.6.9.1 GENERAL PERMANENT OBSERVATION FILES GRANULARITIES

    The table below shows the data granularities of the GPO (General Permanent Observation) supported by this release and used to determine the storage and capacity considerations of the server. The data granularity is the rate at which performance counters can be generated by the network elements, and is usually also the rate at which performance data can be transferred from the network element to the server.

    NE             Granularity (Minutes)
    NODE B         15, 60
    RNC / WCE      15, 30, 60

    Table 10: Supported GPM data granularities

    A 15-minute granularity has been available for Node B from UA06 onwards. The default granularity of the Node B is 60 minutes. It is recommended to leave the Node B granularity at its default value to optimize storage and bandwidth utilization.

    Engineering Recommendation: Setting GPM Granularity Period on NodeB

    A granularity period of 15 minutes can be set on individual NodeBs. Unless a particular NodeB is being troubleshot, it is recommended to set the same counter collection granularity period on all NodeBs to avoid misalignment in performance reporting.
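The storage and bandwidth impact of the granularity choice can be approximated by counting collection periods per day. The sketch below is a first-order view only: it assumes one observation file per NE per granularity period and hypothetical function names.

```python
MINUTES_PER_DAY = 24 * 60


def collections_per_day(granularity_minutes: int) -> int:
    """Number of granularity periods (and, by assumption, files) per NE per day."""
    if MINUTES_PER_DAY % granularity_minutes != 0:
        raise ValueError("granularity must divide a day evenly")
    return MINUTES_PER_DAY // granularity_minutes


def relative_file_count(granularity_minutes: int, default_minutes: int = 60) -> float:
    """File count relative to the default 60-minute Node B granularity."""
    return collections_per_day(granularity_minutes) / collections_per_day(default_minutes)


print(collections_per_day(15), collections_per_day(60), relative_file_count(15))
```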

    3.6.9.2 WCE/RNC COUNTER LIST MANAGEMENT

    Counters list management allows users to specify the list of counters (Screening) to be activated on the WCE/RNC, and to have the collection and mediation layer of WMS aligned with this counters list.

    The advantages of this feature are:

    - The WCE/RNC dedicates its resources to call management, as opposed to call monitoring.

    - Some counters are linked to a UTRAN feature. If the feature is not activated, the counter is not used and can be left out.

    - Some other counters need to be activated when a new feature or functionality is introduced, or for a specific optimization service, but do not need to be active for day-to-day network operation. This allows customers to select what they want.

    The counters list can be defined manually and activated through WICL with the addition of csv formatted ASCII files. A GUI WICLET is also available to facilitate counter list editing. This WICLET serves as a front-end for counter list manipulation, especially in bulk management (FTP transfers between WMS client and WMS server are therefore managed automatically).

    Engineering Note: Scope of Counter selection

    With WCE/RNC, the selection is available at individual counter level (screening). This enables a fine-grained selection that reduces the load on the WCE/RNC. The one exception is the groups of five counters that have the same name except for ending in .Avg, .Min, .Max, .NbEvt and .Cum, which can only be selected or deselected together as a group.

    The counter list csv file has several parameters per counter, including isActivated, group, name, measuredobjectclass, weight and priority. Users can estimate the weight of their customized WCE/RNC counter list from the weight values of the counters whose isActivated value is set to Y, assuming all counters from the same group have an identical isActivated value. The weight is estimated as follows:

    1. The total weight of the WCE/RNC counter list can be estimated by considering only the counters whose measuredObjectClass field is set to RNCFunction/Cell, summing up the value of their weight field, and multiplying the result by the number of cells configured or projected for the WCE/RNC considered.

    2. The RNC counter list total weight can then be compared to the RNC max counter capacity (table below). The RNC max counter capacity depends on the RNC's Inode platform type, which is given by the value of the Inode's attribute EM/RncIn/hardwareCapability, available through WICL.

    RNC Platform Type       HW capability provisioning       Maximum Counter
                            (Max Cells supported)            Instances supported
    DCPS used with CP3      allPktServSP2 (1200)             4.75 million
    CP4+MS3                 allPktServSP2FullDim (2448)      9.5 million

    Table 11: Maximum Counter Instances supported per RNC Platform type

    Engineering Recommendation: Activation and Capacity rule

    Although it is still possible to configure the list of counters and counter families through the regular WCE/RNC provisioning tools (WIPS, Object Editor, WICL, etc.) by modifying the counterIdList/familyIdList PerfScanners attributes, it is highly recommended to use the WICL command dedicated to Counters List Management in order to avoid any unexpected counter drops by the WCE/RNC.

    Prior to counter list activation, it is highly recommended to estimate the load of the list through a Check command. When using the Check command, the end user is asked to specify the maximum number of cells likely to be configured for this WCE/RNC and the WCE/RNC platform type. Based on this information, WICL estimates the total weight associated with a given counter list file, expressed as a percentage of the WCE/RNC max counter capacity.

    A counter list should not be activated if the load on the WCE/RNC exceeds 80%.
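The weight estimation and the 80% rule can be sketched offline as follows. This is not the WICL Check command: the csv layout (comma-separated with a header row), the sample counter names and the sample weights are all hypothetical; only the field names, the RNCFunction/Cell filter, the capacities from Table 11 and the 80% limit come from the text.

```python
import csv
import io

# Maximum counter instances per hardwareCapability value, from Table 11.
MAX_COUNTER_INSTANCES = {
    "allPktServSP2": 4_750_000,
    "allPktServSP2FullDim": 9_500_000,
}
LOAD_LIMIT_PERCENT = 80


def counter_list_load(csv_text: str, cells: int, hardware_capability: str) -> float:
    """Estimated load of a counter list, as a percentage of the max counter capacity."""
    total_weight = 0.0
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {key.strip().lower(): value.strip() for key, value in raw.items()}
        if row["isactivated"].upper() != "Y":
            continue  # deactivated counters do not contribute
        if row["measuredobjectclass"] != "RNCFunction/Cell":
            continue  # only per-cell counters are considered in the estimate
        total_weight += float(row["weight"])
    return 100 * total_weight * cells / MAX_COUNTER_INSTANCES[hardware_capability]


# Hypothetical counter list used only to exercise the function.
SAMPLE = """isActivated,group,name,measuredObjectClass,weight,priority
Y,GroupA,counter1,RNCFunction/Cell,1500,1
Y,GroupA,counter2,RNCFunction/Cell,1000,2
N,GroupB,counter3,RNCFunction/Cell,900,3
Y,GroupC,counter4,RNCFunction,50,1
"""

load = counter_list_load(SAMPLE, cells=1200, hardware_capability="allPktServSP2")
print(f"{load:.1f}% of capacity, activation advisable: {load <= LOAD_LIMIT_PERCENT}")
```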

    Engineering Recommendation: Overall Monitoring

    This feature can be set on individual WCE/RNCs. Unless a particular WCE/RNC is being troubleshot, it is recommended to activate an identical counters list on all WCE/RNCs to avoid misalignment in performance reporting.

    Engineering Note: WCE/RNC Counter List Management

    - Note that even if counters are deactivated in the csv counter list file, they will still appear in the 3GPP observation xml files produced by WMS with null values.

    - If the number of cell-counters records reaches or exceeds 80% of the WCE/RNC capacity limit, the WCE/RNC raises a Warning alarm.

    - The priority field value indicates the priority associated with the counter, as defined by R&D and implemented in the WCE/RNC. The higher the priority value, the less important the counter is considered. In case of resource shortage, the WCE/RNC stops collecting counters starting with those that have the highest priority values.

    3.6.9.3 IBTS RESOURCE TRACKING LIMITATION

    A new feature in LR14W, iBTS Resource Tracking using PM Counters, introduces more than one thousand counters in the iBTS. If all the counters are activated for all the locations for a long duration, with a high number of BTSCells (9 Cells/1CCM/5CEM or above), there is a risk for the WMS PM KPI in data recovery situations. To avoid impacting the WMS PM KPI on the different WMS hardware configurations, the maximum number of iBTS on which resource tracking counters can be activated is limited. By default, WMS allows 50 iBTS to activate iBTS resource tracking counters. However, this number can be configured in the configuration file /opt/nortel/config/utran/config/runtime_properties/default/properties.xml

    To enforce the limit, a check is performed on the CM parameter "BTSEquipment.isResourceTrackingActivated". When this parameter is modified to "TRUE", CP checks the total number of iBTS on the WMS server for which the parameter has the value "true". If the number is greater than the configured value (50 by default), an error is reported and the modification is refused.

    The following table gives, for each WMS hardware configuration, the maximum number of iBTS with resource tracking enabled.

    WMS hardware       Total iBTS      iBTS number limitation for Resource Tracking
    configuration      number          (9Cell/1CCM/5CEM)
    XLarge             4000            500
    XLarge             3000            400
    Large              2000            200
    Medium             1000            100
    Medium             900             100
    Small              200             50

    Table 12: Maximum iBTS supported with resource tracking enabled per WMS hardware type
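The check applied when "BTSEquipment.isResourceTrackingActivated" is set to true can be pictured with the sketch below. It is not the WMS implementation: the function, the exception and the dictionary of flags are hypothetical, and it assumes the count is evaluated as if the requested modification had been applied.

```python
DEFAULT_LIMIT = 50  # default value quoted in the text; configurable in properties.xml


class ResourceTrackingLimitError(Exception):
    """Raised when enabling resource tracking would exceed the configured limit."""


def enable_resource_tracking(flags: dict[str, bool], ibts_id: str,
                             limit: int = DEFAULT_LIMIT) -> dict[str, bool]:
    """Return updated flags, or refuse the change if the limit would be exceeded."""
    updated = {**flags, ibts_id: True}
    active = sum(updated.values())  # iBTS with resource tracking enabled
    if active > limit:
        raise ResourceTrackingLimitError(
            f"{active} iBTS would have resource tracking enabled, limit is {limit}")
    return updated


flags = {f"ibts-{n}": True for n in range(49)}
flags = enable_resource_tracking(flags, "ibts-49")  # 50th activation is accepted
try:
    enable_resource_tracking(flags, "ibts-50")      # 51st activation is refused
except ResourceTrackingLimitError as error:
    print("refused:", error)
```

On larger hardware configurations the limit argument would be raised to the corresponding Table 12 value.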

    3.6.9.4 RECOVERY (CATCHUP TIME RATIO)

    For a server that is fully loaded with NEs, the catch-up time ratio is around 1:1. This means that if the server is down (outage, network outage, patch installation, maintenance, etc.) for one hour, it will take one hour to catch up (or 1 day of catch-up for 1 day down). Once caught up, the server is back to its normal steady state and all the XML files are delivered as per their normal schedule.

    Note that in some special circumstances like during intensive use of RNC call trace, the time required to recover from an outage can increase. For planned outages (for example during an upgrade), it is recommended to stop call trace sessions on the WCE/RNC before the outage.

    3.6.9.5 XML FILE STORAGE CAPACITY

    XML File compression (gzip)

    All data mediation processes have the capability to generate compressed XML files, including UMTS Access Call Trace. The compressed files will be in the gzip format (with .gzip extension).

    Engineering Recommendation: XML File Storage

    It is recommended that the compressed gzip file format always be used for storing performance management files, especially when the NodeB granularity period is set to 15 minutes, as it has several advantages:

    - more XML files can be stored,

    - less bandwidth is required to transmit them over the network,

    - backup and restore are faster.

    As a general rule, the compression ratio achieved with the gzip format is typically 90% or better, so up to 10x more data can be stored using this format.
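The space saving can be measured on any sample file with a few lines of Python. The sketch below uses a synthetic, highly repetitive XML payload with made-up element and counter names, so the ratio it prints is not representative of real WMS observation files; it only shows how the figure is computed.

```python
import gzip


def space_saving(data: bytes) -> float:
    """Fraction of space saved by gzip (0.9 means the file shrinks by 90%)."""
    return 1 - len(gzip.compress(data)) / len(data)


# Synthetic payload; element and counter names are hypothetical.
records = "".join(
    f'<r cell="{i}" counter="VS.Example" value="{i % 7}"/>' for i in range(5000)
)
sample = f"<measurements>{records}</measurements>".encode()

saving = space_saving(sample)
print(f"space saved: {saving:.0%}, storage multiplier: {1 / (1 - saving):.1f}x")
```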

    Global WMS purge functionality

    The purging algorithm is applied on the Main Server to global partitions. The XML data (observations/counters, call trace data, etc.) is stored in the /opt/nortel/data partition, and the purge algorithm attempts to maintain this global partition at 80% usage. This leads to a more efficient usage of the disk space, so in general a WMS server is expected to be capable of storing more days of XML data than was possible in previous releases.

    The purge functionality can bring noticeable changes in the number of days stored for different types of networks. One reason is that the number of days of XML data that can be kept is assessed dynamically, so it can vary over time. Also, the number of storage days is applied uniformly across all data on the server. In all cases, the WMS server should be able to keep a minimum of 3 days of XML data.

    3.6.9.6 REAL TIME COUNTERS MONITORING

    The feature provides a restricted set of observation metrics from selected cells, with a granularity period and reporting period of a few minutes. Real Time counter activation does not interfere with the Normal General Permanent Observation (GPO) behavior.

    The purpose of this feature is to quickly visualize UTRAN synthetic counter values and their evolution as soon as they are available, during special events (e.g. a large gathering at a given place) or after a local parameter change (e.g. a new site configuration).

    These measurement results are made available on a dedicated GUI integrated into the WMS GUI client, either as a table or as graphs.

    Engineering Note: Scope

    RTO (Real Time Observation) is applicable to a predefined list of metrics and cannot be customized. There are 18 metrics, grouped in threes: a total count, a count for CS calls and a count for PS calls:

    - Call attempts: attempts, attempts CS and attempts PS

    - Call setup successes: setups, setups CS and setups PS

    - Setup success rate: setup%, setup CS% and setup PS%

    - Call drops: drops, drops PS and drops CS

    - Call drop rate: drop%, drop CS% and drop PS%

    - Traffic expressed in number of bytes: traffic, traffic CS and traffic PS.

    These metrics are calculated by the RTO PM application on the WMS server using counters that are available on both Cell and RNC observed objects.

    System Restriction: WCE Support of Real Time Observation

    WCE does not currently support Real Time Observation counters.

    Counters should become available at the WMS GUI within about 5 minutes, considering that the collection granularity is fixed at 3 minutes and the end-to-end reporting latency is about 2 minutes. On a network that complies with the WMS bandwidth recommendations, the latency should be reduced to about 1 minute.

    Engineering Note : Connectivity and Firewall consideration

    Use of this feature requires that the customer network allow UDP traffic on port 11980 originating from the observed RNC and terminating at the WMS server. The firewall rules between OAM and RNC (OAM_NE interface) should be applied accordingly. Please see "UMT/OAM/APP/024293 Alcatel-Lucent 9353 WMS - Ports and Services" for the list of ports used within the WMS ROC perimeter.

    The following figure summarizes the overall architecture and communication channels.

    Figure 7: Overall architecture and communication channels

    Engineering Recommendation: Activation and Capacity Limitation

    It is recommended to activate the feature on a selected Zone with a limited number of RNCs and cells, and during a limited troubleshooting period.

    The recommendation is to activate reporting for up to two RNCs at a time; the system will not accept more than 120 cells within each RNC. With multiple users (one Real Time synthetic counter session is allowed per client instance), this capacity limit is shared and has to be considered accordingly (e.g. if 10 users have each defined an RTO against 12 different cells on the same RNC, another user will not be able to modify an RTO to add more cells on this same RNC).

    If reporting is not stopped by the end user from the GUI, the RNC stops reporting RTO data automatically after 24 hours.
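The shared 120-cell limit per RNC can be illustrated with the sketch below. The function name and data layout are hypothetical, and it assumes the limit applies to the number of distinct cells observed on one RNC across all user sessions, which matches the 10 users x 12 cells example.

```python
MAX_RTO_CELLS_PER_RNC = 120  # limit per RNC quoted in the text


def can_observe(existing_sessions: list[set[str]], requested_cells: set[str],
                limit: int = MAX_RTO_CELLS_PER_RNC) -> bool:
    """True if the requested cells fit in the remaining RTO capacity of one RNC."""
    already_observed = set().union(*existing_sessions)
    return len(already_observed | requested_cells) <= limit


# 10 users, each observing 12 different cells on the same RNC (120 cells in total).
sessions = [{f"cell-{user * 12 + n}" for n in range(12)} for user in range(10)]

print(can_observe(sessions, {"cell-999"}))      # no capacity left for a new cell
print(can_observe(sessions[:9], {"cell-999"}))  # 108 cells in use, one more fits
```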

    3.6.9.7 UTRAN CALL TRACE

    WMS server supports the following types of call trace:

    Call Trace Session                            Purpose
    Neighbouring Call Trace CTn                   To trace mobility specific events and trace handovers between neighbouring cells
    Core Network Invoked Call Trace CTa           To trace one or several UE calls selected by the Core Network and to trace UE emergency calls
    Access Invoked Call Trace CTb                 To trace dedicated data for calls based on a predefined UE identity (TMSI, P-TMSI, IMSI or IMEI)
    Geographic Call Trace CTg                     To trace dedicated data for calls established within a geographical area in the UTRAN (may be a cell, a set of cells or all the cells in the RNS)
    Object Trace on the cell object OTCell        To trace data related to a cell or several cells (i.e. NBAP common measurements: Transmitted Carrier Power, RSSI)
    Object Trace on IuCs object OTIuCs            To trace common Iu CS data (i.e. not linked to a given call) RANAP messages on Iu Cs interface
    Object Trace on IuPs object OTIuPs            To trace common Iu PS data (i.e. not linked to a given call) RANAP messages on Iu Ps interface
    Object Trace on IuR object OTIuR              To trace common Iur data (i.e. not linked to a given call) RNSAP messages on Iur interface
    Object Trace on IuBC object OTIuBC            To trace common (i.e. not linked to a given call) SABP messages on IuBC interface

    Table 13: Call Trace Type Definitions

    Engineering Note: CTp and CTf support

    Please note that CTp (PermanentSession) and CTf (FailureSession) are not supported by WMS and are reserved for future use.

    The CTp session is applicable to WCE only and not to the 9370 RNC; it is therefore supported by TCE only, not by WMS.

    Engineering Note: Call Trace Support for WCE

    9771 RNC (WCE) Call Trace is managed by a new entity, the TCE (Trace Collection Entity), deployed as a standalone server. The original WMS CT is dedicated to 9370 RNC call trace only.

    For more information on TCE, please refer to 9YZ-06174-0113-PGZZA - NPO Engineering Guide.

    From OAM9.1, there is a significant increase in Call Trace data, with a feature allowing 100% CTn to be activated per RNC and a re-alignment of CTg, CTa, CTb and OTx with the dynamic capacity of the RNC. This is made possible by the capacity evolution of the RNC DCPS/eDCPS modules, which allows further enhancements of CT capabilities that need to be supported by the WMS.

    CTn tracing capacity realignment and CTn 100% UE calls tracing

    As from OAM9.1, the following table summarizes the number of simultaneous CTn sessions supported by WMS Main Server.

    Number of simultaneously supported CTn Sessions

    Server Type                                     CTn Sessions
    T5220 (Small)                                   3
    SF V890 (Medium)                                7
    T5440, T4-1 (Medium)                            10
    M5000 (4 CPU) & E4900 (8 CPU) (Large)           20
    M5000 (8 CPU) & E4900 (12 CPU) (X-Large)        50

    Table 14: Engineering Guidelines for simultaneous CTn Sessions

    Based on the above capacity, WMS collects a large amount of CTn data, with a retention period set to 1 day. As an example, for the Large and X-Large configurations, the amount of data generated can be as follows, depending on the traffic profile of the RNC:

    Server Size                                                Large      X-Large
    Maximum CTn sessions activated in WMS simultaneously       20         50
    Maximum uncompressed XDR rate per RNC (MB/min)             9.2        9.2
    Maximum uncompressed XDR rate per WMS (MB/min)             184        460
    Maximum compressed XDR rate per WMS (MB/min)               73.6       184
    Calls processed simultaneously per session                 24000      24000

    Table 15: Example Data Generation for Large/X-Large WMS server for simultaneous CTn Sessions
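The per-WMS figures in Table 15 follow from the per-RNC rate and the number of sessions, and can be extended to a daily volume. In the sketch below the function name is hypothetical, the 0.4 compressed fraction is inferred from the table (73.6 / 184 and 184 / 460), and the daily figure is an upper bound that assumes every session runs at the maximum rate all day.

```python
XDR_MB_PER_MIN_PER_RNC = 9.2  # maximum uncompressed XDR rate per RNC, from Table 15
COMPRESSED_FRACTION = 0.4     # inferred from Table 15: 73.6 / 184 = 184 / 460 = 0.4
MINUTES_PER_DAY = 24 * 60


def ctn_data_rates(ctn_sessions: int) -> tuple[float, float, float]:
    """Return (uncompressed MB/min, compressed MB/min, compressed MB/day) per WMS."""
    uncompressed = ctn_sessions * XDR_MB_PER_MIN_PER_RNC
    compressed = uncompressed * COMPRESSED_FRACTION
    return (round(uncompressed, 1), round(compressed, 1),
            round(compressed * MINUTES_PER_DAY, 1))


large = ctn_data_rates(20)
x_large = ctn_data_rates(50)
print(large, x_large)
```

With the 1-day retention period, the last value of each tuple is also an upper bound on the compressed CTn data held on disk at any time.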

    CTn tracing capacity realignment and CTn 100% UE calls tracing are controlled by two parameters:

    - neighbouringSessionTracingFloor

    - neighbouringSessionTracingRatio

    If both are set to 0, the feature shall not be activated by the RNC for the related CTn session and the CTn capacity limits shall remain unchanged for the related CTn session.

    If either of the two parameters or both these parameters are set to a value distinct from 0, then the feature shall be activated by the RNC for the related CTn session.

    If either of the two parameters or both these parameters are set to a value distinct from 0, and the parameter "neighbouringSessionTracingRatio" is not set to the value corresponding to 100%, then the RNC shall dynamically and automatically adapt the CTn tracing capacity depending on the effective traffic handled by the RNC and the requested CTn tracing level configuration defined by these two parameters.

    If the parameter "neighbouringSessionTracingRatio" is set to the value corresponding to 100%, then the RNC shall trace 100% of the UE calls for the related CTn session.
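The activation rules for the two parameters can be summarized as a small decision function. This is a sketch only: the enum and function names are hypothetical, and because the text does not give the parameter value that corresponds to 100%, it is passed in by the caller.

```python
from enum import Enum


class CtnMode(Enum):
    NOT_ACTIVATED = "feature not activated, CTn capacity limits unchanged"
    DYNAMIC = "RNC adapts the CTn tracing capacity to the effective traffic"
    TRACE_ALL = "RNC traces 100% of the UE calls"


def ctn_mode(tracing_floor: int, tracing_ratio: int, ratio_value_for_100_percent: int) -> CtnMode:
    """Behaviour for one CTn session.

    tracing_floor  -> neighbouringSessionTracingFloor
    tracing_ratio  -> neighbouringSessionTracingRatio
    ratio_value_for_100_percent is an assumption supplied by the caller.
    """
    if tracing_floor == 0 and tracing_ratio == 0:
        return CtnMode.NOT_ACTIVATED
    if tracing_ratio == ratio_value_for_100_percent:
        return CtnMode.TRACE_ALL
    return CtnMode.DYNAMIC


# Example with a hypothetical encoding where the value 100 means 100%.
print(ctn_mode(0, 0, 100), ctn_mode(5, 20, 100), ctn_mode(0, 100, 100))
```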

    The maximum number of simultaneous CTn traced calls per TMU, N, shall be for each TMU the maximum between the two following values:

    - the absolute minimal number of calls per TMU to be traced by the CTn session independentl