8/11/2019 Method of Fault Locating
Huawei Transport Network Maintenance Reference (Volume 5)
RTN Microwave
Issue 01
Date 2011-12-30
HUAWEI TECHNOLOGIES CO., LTD.
Huawei Transport Network Maintenance Reference
(Volume 5)
RTN Microwave About This Document
Issue 01 (2011-12-30) Huawei Proprietary and Confidential
Copyright Huawei Technologies Co., Ltd.
i
About This Document
Overview
To assist maintenance engineers in troubleshooting, this document describes how to troubleshoot OptiX RTN products. It is organized as follows:
Basic principles and common methods for locating faults
This chapter describes basic principles and common methods for locating faults. Each method is illustrated using an example.
Troubleshooting process and guide
This chapter describes the general troubleshooting process, fault categories, and how to diagnose each category of faults.
Equipment interworking guide
This chapter provides criteria for correct interworking between OptiX RTN products and other products, and methods used for locating interworking faults.
Typical cases
This chapter provides typical troubleshooting cases for helping maintenance personnel improve their fault diagnosis capabilities.
Appendix
This chapter provides references.
Intended Audience
This document is intended for:
Technical support engineers
Maintenance engineers
Symbol Conventions
The symbols that may be found in this document are defined as follows.
Symbol Description
Indicates a hazard with a high level of risk, which if not avoided, will result in death or serious injury.
Indicates a hazard with a medium or low level of risk, which if not avoided, could result in minor or moderate injury.
Indicates a potentially hazardous situation, which if not avoided, could result in equipment damage, data loss, performance degradation, or unexpected results.
Indicates a tip that may help you solve a problem or save time.
Provides additional information to emphasize or supplement important points of the main text.
General Conventions
The general conventions that may be found in this document are defined as follows.
Convention | Description
Times New Roman | Normal paragraphs are in Times New Roman.
Boldface | Names of files, directories, folders, and users are in boldface. For example, log in as user root.
Italic | Book titles are in italics.
Courier New | Examples of information displayed on the screen are in Courier New.
Change History
Updates between document issues are cumulative. Therefore, the latest document issue contains all updates made in previous issues.
Updates in Issue 01 (2011-12-30)
This issue is the first formal release.
Contents
About This Document
1 Basic Principles and Common Methods for Locating Faults
1.1 Basic Principles for Locating Faults
1.2 Common Methods for Locating Faults
1.3 Signal Flow Analysis
1.3.1 Application Scenarios
1.3.2 Method Description
1.3.3 Application Example
1.4 Alarm and Performance Analysis
1.4.1 Application Scenarios
1.4.2 Method Description
1.4.3 Application Example
1.5 Receive and Transmit Power Analysis
1.5.1 Application Scenarios
1.5.2 Method Description
1.5.3 Application Example
1.6 Loopback
1.6.1 Application Scenarios
1.6.2 Method Description
1.6.3 Application Example
1.7 Replacement
1.7.1 Application Scenarios
1.7.2 Method Description
1.7.3 Application Example
1.8 Configuration Data Analysis
1.8.1 Application Scenarios
1.8.2 Method Description
1.8.3 Application Example
1.9 Tests Using Instruments and Tools
1.9.1 Application Scenarios
1.9.2 Method Description
1.9.3 Application Example
1.10 RMON Performance Analysis
1.10.1 Application Scenarios
1.10.2 Method Description
1.10.3 Application Example
1.11 Network Planning Analysis
1.11.1 Application Scenarios
1.11.2 Method Description
1.11.3 Application Example
2 Troubleshooting Process and Guide
2.1 Troubleshooting Process Overview
2.2 Fault Categories
2.3 Troubleshooting Radio Links
2.3.1 Radio Link Faults
2.3.2 Signal Propagation Faults
2.4 Troubleshooting TDM Services
2.5 Troubleshooting Data Services
2.5.1 Services at All Base Stations on an Entire Network or in an Area Are Interrupted
2.5.2 Services at All Base Stations on an Entire Network or in an Area Experience Packet Loss
2.5.3 Services at Some Base Stations in an Area Are Interrupted
2.5.4 Services at Some Base Stations in an Area Experience Packet Loss
2.6 Troubleshooting Microwave Protection
2.6.1 Switchover Failure or Delay in Microwave 1+1 Protection
2.6.2 Failure to Switch to the Main Unit in Microwave 1+1 Protection
2.6.3 Switchover Failure or Delay in SNCP Protection
2.7 Troubleshooting Clocks
2.7.1 Analyzing Clock Faults
2.7.2 Handling Common Clock Alarms
2.8 Troubleshooting DCN Communication
2.8.1 Fault Symptoms and Possible Causes
2.8.2 DCN Troubleshooting Process
3 Equipment Interworking Guide
3.1 Interworking Criteria
3.1.1 Interworking Through Ethernet Ports
3.1.2 Interworking Through SDH Ports
3.1.3 Interworking Through PDH Ports
3.2 Methods for Locating Interworking Faults
4 Typical Cases
4.1 List of Cases
4.2 Radio Link Faults
4.2.1 Radio Link Interruptions Due to Multipath Fading
4.2.2 Service Bit Errors Due to Interference to Radio Links
1 Basic Principles and Common Methods for Locating Faults
This chapter describes basic principles and common methods for locating faults. Each method is illustrated using an example.
1.1 Basic Principles for Locating Faults
Purpose
To locate a fault to a radio site or a radio hop.
Description
Because a transmission equipment fault affects services over a large area, fault locating aims to narrow down the area where the fault most likely lies.
Table 1-1 lists the basic principles for locating faults. These principles are summarized based
on characteristics of transmission equipment.
Table 1-1 Basic principles for locating faults
Basic Principle | Description
External first, transmission next | Rule out external faults, for example, faults on power supply equipment or interconnected equipment, or cable damage.
Network first, NE next | Locate a fault to a radio site or a radio hop based on fault symptoms.
High-speed section first, low-speed section next | Alarms of high-speed signals generally cause alarms of low-speed signals. Therefore, clear faults in the high-speed section first.
High-severity alarms first, low-severity alarms next | First handle high-severity alarms, such as critical alarms and major alarms. Then handle low-severity alarms, such as minor alarms and warnings.
1.2 Common Methods for Locating Faults
Table 1-2 lists common methods for locating faults. Network faults can be located quickly by using a combination of these methods. In actual applications, maintenance engineers are expected to locate and rectify faults quickly by using various fault locating methods.
Figure 1-1 Common methods for locating faults
Table 1-2 Common methods for locating faults
Method | Applicable Scope | Brief Introduction
Signal flow analysis | All scenarios | This method helps locate a fault to a radio site or radio hop. Familiarity with service signal flows, cable connections, and air-interface link connections helps analyze fault symptoms and locate possibly faulty points.
Alarm analysis | All scenarios | Alarms carry detailed fault information. Handle alarms reported by faulty points immediately after analyzing service signal flows.
Receive and transmit power analysis | Locating radio link faults | By analyzing the current and historical receive and transmit power on a radio link, determine whether any errors, for example, interference and fading, exist on the radio link.
Loopback | Locating a fault to a component or site section by section | This method is fast and independent of alarm and performance event analysis. However, it affects embedded control channels (ECCs) and normal service running.
Replacement | Locating a fault to a component or board, or identifying external faults | This method does not require sound theoretical knowledge or skills but requires spare parts. It applies to nearby sites.
Configuration data analysis | Locating service faults when both hardware and radio links work normally | This method covers configuration data analysis, configuration data modification, and modification verification, and therefore has high requirements for maintenance personnel.
Tests using instruments and tools | Isolating external faults and addressing interworking issues | This method provides accurate results. Before using this method, interrupt services.
RMON performance analysis | Locating faults in data services | Statistics are collected routinely to analyze Ethernet board information, for example, service performance.
Network planning analysis | Diagnosing performance deterioration and frequent interruption of radio links | This method addresses availability issues of radio links. It requires analysis of planned parameters such as fading margin and of measures against multipath fading.
Experience-based fault handling | Special scenarios | With rich troubleshooting experience, you can locate faults quickly by analyzing fault symptoms and network architecture.
1.3 Signal Flow Analysis
1.3.1 Application Scenarios
Signal flow analysis is commonly used to locate faults. It is especially helpful when multiple network elements (NEs) become unreachable to the network management system (NMS) or when base station services are faulty at multiple points.
1.3.2 Method Description
Based on network connection diagrams, logical service relationships, and system functional block diagrams, this method allows you to analyze service flow directions, identify possibly faulty points, and then locate the faulty ones.
Use this method if you need to locate a fault to a site or link on a network or locate a fault to a module.
1.3.3 Application Example
Fault Symptoms
As shown in Figure 1-2, a microwave chain network was set up, and all 2G and 3G base station services in an area were interrupted for approximately 10 minutes.
Figure 1-2 Network example for signal flow analysis
[Figure: microwave chain network consisting of NE1701 through NE1712. The legend marks the area where services were interrupted and the backhaul signal flow.]
Cause Analysis and Handling Procedure
Step 1 Checked the distribution of the NEs on which services were interrupted and the service flow direction.
NE1704 converged the interrupted services, so the service interruption was related to
NE1704.
Step 2 Checked alarms and operation records on NE1704.
NE1704 reported an MW_CFG_MISMATCH alarm, and the Hybrid radio E1 capacity was changed on NE1704 right before the services were interrupted. It was inferred that the services were interrupted due to an E1 capacity mismatch between NE1704 and NE1705.
Step 3 Corrected the Hybrid radio E1 capacity on NE1704.
The fault was rectified.
----End
Conclusions and Suggestions
If services are interrupted at multiple points, signal flow analysis generally proves that their
convergence point is faulty.
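The conclusion above can be illustrated with a minimal Python sketch. It assumes a loop-free topology in which every NE has exactly one next hop toward the backhaul side. The NEXT_HOP map is a hypothetical arrangement of the NEs named in Figure 1-2, not the actual connections of that network, and the function names are illustrative. The sketch walks the path of each interrupted NE toward the backhaul side and returns the first NE shared by all paths.

```python
from typing import Dict, List, Optional

# Hypothetical topology (an assumption for illustration only): each NE maps to
# its next hop toward the backhaul side; None marks the last NE in the chain.
NEXT_HOP: Dict[str, Optional[str]] = {
    "NE1708": "NE1707", "NE1707": "NE1706", "NE1706": "NE1705", "NE1705": "NE1704",
    "NE1712": "NE1711", "NE1711": "NE1710", "NE1710": "NE1709", "NE1709": "NE1704",
    "NE1704": "NE1703", "NE1703": "NE1702", "NE1702": "NE1701", "NE1701": None,
}


def path_to_backhaul(ne: str, next_hop: Dict[str, Optional[str]]) -> List[str]:
    """Return the NEs a service traverses from `ne` toward the backhaul side."""
    path: List[str] = []
    current: Optional[str] = ne
    # Stop at the end of the chain, or if the map accidentally contains a loop.
    while current is not None and current not in path:
        path.append(current)
        current = next_hop.get(current)
    return path


def convergence_point(interrupted: List[str],
                      next_hop: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the first NE shared by the paths of all interrupted NEs."""
    paths = [path_to_backhaul(ne, next_hop) for ne in interrupted]
    if not paths:
        return None
    common = set(paths[0]).intersection(*paths[1:])
    # paths[0] is ordered from the interrupted NE toward the backhaul side,
    # so the first common entry is the nearest convergence point.
    for ne in paths[0]:
        if ne in common:
            return ne
    return None


if __name__ == "__main__":
    print(convergence_point(["NE1706", "NE1708", "NE1710", "NE1712"], NEXT_HOP))
```

With this assumed topology, the interrupted NEs on both branches share NE1704 as their nearest common point, which is where alarms and operation records would be checked first.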
1.4 Alarm and Performance Analysis
1.4.1 Application Scenarios
Alarms carry detailed fault information. When a fault occurs, first check the alarms reported by possibly faulty equipment.
Checking current and historical alarms, fault symptoms, and fault time helps narrow down the
most likely areas for faults, and helps locate a fault to a hop, site, or module.
The alarm and performance analysis method requires proficiency in using the NMS and analyzing service signal flows.
1.4.2 Method Description
Step 1 Use the NMS to obtain information about equipment alarms and performance events on an entire network.
Step 2 Sort the alarms by severity and handle the alarms in the following sequence:
1. Hardware alarms, such as HARD_BAD, BD_STATUS, and VOLT_LOS
2. Link alarms, such as IF_CABLE_OPEN, MW_LOF, RADIO_RSL_LOW, R_LOC,
R_LOF, and R_LOS
3. Service alarms
4. Protection alarms, such as HSB_INDI, RPS_INDI, XCP_INDI, and APS_INDI
----End
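The handling sequence in Step 2 can be sketched in Python as follows. The alarm names are the examples listed above. Treating any alarm that is not listed as a service alarm is an assumption made for this sketch, and the function and table names are illustrative.

```python
from typing import Dict, List

# Example alarms from Step 2, grouped by the category they are listed under.
ALARM_CATEGORY: Dict[str, str] = {
    "HARD_BAD": "hardware", "BD_STATUS": "hardware", "VOLT_LOS": "hardware",
    "IF_CABLE_OPEN": "link", "MW_LOF": "link", "RADIO_RSL_LOW": "link",
    "R_LOC": "link", "R_LOF": "link", "R_LOS": "link",
    "HSB_INDI": "protection", "RPS_INDI": "protection",
    "XCP_INDI": "protection", "APS_INDI": "protection",
}

# Handling sequence from Step 2: hardware, link, service, protection.
CATEGORY_ORDER: List[str] = ["hardware", "link", "service", "protection"]


def handling_order(alarms: List[str]) -> List[str]:
    """Order alarm names by handling category.

    sorted() is stable, so alarms in the same category keep their input order.
    """
    def rank(name: str) -> int:
        # Assumption: an alarm not listed above is treated as a service alarm.
        return CATEGORY_ORDER.index(ALARM_CATEGORY.get(name, "service"))

    return sorted(alarms, key=rank)


if __name__ == "__main__":
    print(handling_order(["XCP_INDI", "MW_LOF", "HARD_BAD"]))
```

In the example in 1.4.3, this ordering puts HARD_BAD ahead of XCP_INDI, which matches the order in which the two alarms were analyzed there.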
1.4.3 Application Example
Fault Symptoms
An OptiX RTN 620 NE on a network reported a HARD_BAD alarm and an XCP_INDI alarm.
Cause Analysis and Handling Procedure
Step 1 Checked alarms.
Boards in slots 1, 5, 6, and 7 reported the HARD_BAD alarm.
The PXC board in slot 1 reported a HARD_BAD alarm, whose parameters indicated that the 38M clock was lost and the analog phase-locked loop (PLL) was unlocked.
The boards in slots 5, 6, and 7 reported the HARD_BAD alarm, whose parameters indicated that the 38M clock was lost and the PXC board in slot 1 was faulty. The fault caused loss of the first 38M clock.
Step 2 Checked the XCP_INDI alarm.
The HARD_BAD alarm reported by the board in slot 1 triggered a switchover, causing the SCC board to report an XCP_INDI alarm.
Step 3 Replaced the PXC board in slot 1.
The alarms cleared.
----End
Conclusions and Suggestions
If an NE simultaneously reports multiple alarms, analyze their severities, correlations, and
parameters so you can quickly locate the fault to a board or port.
1.5 Receive and Transmit Power Analysis
1.5.1 Application Scenarios
Receive and transmit power analysis is crucial to radio link analysis. By analyzing current and historical receive and transmit power on a link, you can determine whether faults such as radio link blocking, fading, or outdoor unit (ODU) faults occur on the link, and thereby quickly locate the fault.
1.5.2 Method Description
This method allows you to check the receive and transmit power on a link, as well as their
changes using the NMS.
By periodically updating the receive and transmit power table based on radio link directions and network design, you can identify the links whose receive power or transmit power is more than 3 dB higher or lower than the designed value, and then take appropriate measures in a timely manner.
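The 3 dB check described above can be sketched in Python as follows. The record layout, field names, and sample values are assumptions made for illustration. In practice the designed values come from the network design and the measured values from the NMS.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LinkPower:
    """One row of a receive and transmit power table (hypothetical layout)."""
    link: str
    designed_rx_dbm: float
    measured_rx_dbm: float
    designed_tx_dbm: float
    measured_tx_dbm: float


def links_out_of_tolerance(records: List[LinkPower],
                           tolerance_db: float = 3.0) -> List[Tuple[str, float, float]]:
    """Return (link, rx deviation, tx deviation) for links needing attention.

    A link is flagged when its receive or transmit power is more than
    `tolerance_db` higher or lower than the designed value.
    """
    flagged = []
    for r in records:
        rx_dev = r.measured_rx_dbm - r.designed_rx_dbm
        tx_dev = r.measured_tx_dbm - r.designed_tx_dbm
        if abs(rx_dev) > tolerance_db or abs(tx_dev) > tolerance_db:
            flagged.append((r.link, rx_dev, tx_dev))
    return flagged


if __name__ == "__main__":
    table = [
        LinkPower("hop-1", -40.0, -45.5, 20.0, 20.0),  # illustrative values
        LinkPower("hop-2", -42.0, -43.0, 18.0, 18.5),
    ]
    for link, rx_dev, tx_dev in links_out_of_tolerance(table):
        print(f"{link}: rx deviation {rx_dev:+.1f} dB, tx deviation {tx_dev:+.1f} dB")
```

A deviation of exactly 3 dB is not flagged, because the text specifies more than 3 dB.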
1.5.3 Application Example
Fault Symptoms
On an OptiX RTN 600, a 20 km long cross-ocean 1+1 hot standby (HSB) radio link was interrupted intermittently. Alarms such as B1_SD, HSB_INDI, MW_LOF, and R_LOF were reported and lasted from several seconds to dozens of seconds.
Cause Analysis and Handling Procedure
Step 1 Checked the ODU receive power that was recorded during the alarm period.
The difference between the maximum receive power and the minimum receive power was
more than 40 dB, and the minimum receive power was close to or less than the receiver
sensitivity. Therefore, it was inferred that the fault was caused by spatial fading.
Step 2 Checked the network planning design.
The ODU operated in the 8 GHz band, which is less prone to rain fading, so the intermittent link interruptions were attributed to multipath fading. In addition, 1+1 HSB protection does not protect radio links well against multipath fading.
Step 3 Replaced 1+1 HSB protection with 1+1 space diversity (SD) protection.
----End
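The reasoning in Step 1 can be expressed as a small Python check. The 40 dB swing comes from this case and is not a general threshold. The near_margin_db parameter, which defines how close to the receiver sensitivity counts as "close", is an assumption, as are the function name and sample values.

```python
from typing import Sequence


def suggests_spatial_fading(rx_samples_dbm: Sequence[float],
                            sensitivity_dbm: float,
                            swing_threshold_db: float = 40.0,
                            near_margin_db: float = 3.0) -> bool:
    """Return True when recorded receive power shows the pattern seen in this case.

    The pattern has two parts: a large difference between the maximum and
    minimum receive power, and a minimum that is close to or below the
    receiver sensitivity.
    """
    if not rx_samples_dbm:
        raise ValueError("at least one receive power sample is required")
    swing_db = max(rx_samples_dbm) - min(rx_samples_dbm)
    # Assumption: "close to" means within near_margin_db above the sensitivity.
    near_sensitivity = min(rx_samples_dbm) <= sensitivity_dbm + near_margin_db
    return swing_db > swing_threshold_db and near_sensitivity


if __name__ == "__main__":
    # Illustrative samples recorded during an alarm period.
    print(suggests_spatial_fading([-38.0, -45.0, -82.0], sensitivity_dbm=-80.0))
```

A positive result only points toward fading. As in Step 2, the network planning design still has to be checked to tell multipath fading from rain fading.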
Conclusions and Suggestions
Routinely check whether the receive power reaches the designed value. If not, it is recommended that you check the configuration, adjust antennas, or replace ODUs so the receive power reaches the designed value.
Minimize the impact of multipath fading by using one of the following methods, depending on the actual conditions:
  Use low capacity, low-order modulation schemes, and low bandwidths.
  Increase the height difference between antennas at both ends, provided that line-of-sight (LOS) is guaranteed.
  Add two antennas and configure an SD protection group.
1.6 Loopback
1.6.1 Application Scenarios
After a loopback is enabled at a point, signals that should be forwarded in normal cases are routed to the signal source. If services are interrupted, loopbacks can be performed to narrow down fault areas by checking whether each network section is in good condition.
Loopbacks can be software loopbacks or hardware loopbacks. Software loopbacks can be inloops or outloops. For detailed loopback definitions, operation methods, and usage restrictions, see the Maintenance Guide.
1.6.2 Method Description
This method allows you to narrow down fault areas by performing loopbacks at different
points and testing services.
Narrowing Down Fault Areas
As shown in Figure 1-3, point A failed to pass a loopback test, and point B passed a loopback test. Then, the fault existed between point B and point A.
Figure 1-3 Loopbacks helping narrow down fault areas
[Figure: a test meter connected to a chain of Equipment/Board 1 through Equipment/Board 5, with loopback points A and B, the test result at each loopback (OK or ERR), and the faulty section marked.]
Diagnosing Equipment Interworking Faults
If every section of the network passes its loopback test but the end-to-end test still fails, an equipment interworking fault may exist. See Figure 1-4.
Figure 1-4 Loopbacks for diagnosing equipment interworking faults
[Figure: test meters connected to a chain of Equipment 1 through Equipment 5, with the test results (OK or ERR) and the annotation "Check for an equipment interworking fault."]
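The two uses of loopbacks described in this section can be sketched in Python as follows. The sketch assumes that loopback results are listed in order from the point nearest the test meter to the farthest point, and that a fault makes every loopback beyond it fail. The function names and result strings are illustrative.

```python
from typing import List, Optional, Tuple


def locate_faulty_section(results: List[Tuple[str, bool]]) -> Optional[Tuple[str, str]]:
    """Return (last point that passed, first point that failed), or None.

    `results` holds (loopback point, test passed) pairs ordered from the
    point nearest the test meter to the farthest point (an assumption).
    """
    last_ok = "test meter"
    for point, passed in results:
        if not passed:
            return (last_ok, point)
        last_ok = point
    return None


def diagnose(results: List[Tuple[str, bool]], end_to_end_passed: bool) -> str:
    """Combine section-by-section loopback results with the end-to-end test."""
    section = locate_faulty_section(results)
    if section is not None:
        return f"fault between {section[0]} and {section[1]}"
    if not end_to_end_passed:
        # Every section passed on its own, but the whole path still fails.
        return "check for an equipment interworking fault"
    return "no fault found"


if __name__ == "__main__":
    # Point B passes and point A fails, as in the narrowing-down example.
    print(diagnose([("B", True), ("A", False)], end_to_end_passed=False))
    # All loopbacks pass but the end-to-end test fails.
    print(diagnose([("1", True), ("2", True)], end_to_end_passed=False))
```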
1.6.3 Application Example
Fault Symptoms
Figure 1-5 shows a network, where an E1 tributary between the radio network controller (RNC) and third-party equipment reported an alarm.
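Figure 1-5 Loopbacks for locating faults
[Figure: NE1 and NE2 are connected over a radio link through ODUs. Each NE houses IFH2 boards in slots 5 and 7, PXC boards in slots 1 and 3, an SCC board in slot 2, and an SD1 board in slot 6; one NE also houses a PH1 board in slot 4. The figure also shows the OSN equipment, the third-party SDH equipment, the RNC, an E1 BER tester, loopback points 1 and 2, and test points A and B.]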
Cause Analysis and Handling Procedure
Step 1 Analyzed the service signal flow.
The alarmed E1 signal was received from NE2.
Step 2 Checked alarms reported by NE2.
NE2 did not report any hardware alarms or service alarms.
Step 3 Set an inloop at the tributary board (point 1) on NE2, and connected an E1 bit error rate (BER) tester to point A (third-party SDH equipment).
The service had bit errors.
Step 4 Set an outloop at the SD1 board (point 2) on NE1.
The E1 BER tester at point A read no bit error. It was suspected that the radio link between NE1 and NE2 was faulty.
Step 5 Tested the radio link performance by setting an inloop at the tributary board (point 1) on NE2 and connecting an E1 BER tester to point B (OptiX OSN equipment).
The E1 BER tester at point B read no bit error.
1.7.3 Application Example
Fault Symptoms
See the following figure. Two sites, site A and site B, were interconnected using 2+0 radio links. At each site, ODUs of the same type (with the same sub-band but different working frequencies) were used. NE B-2 at site B frequently reported service alarms such as R_LOC and R_LOF.
Figure 1-6 Replacement for locating faults
[Figure: site A (NE A-1 with ODU A-1, NE A-2 with ODU A-2) and site B (NE B-1 with ODU B-1, NE B-2 with ODU B-2) connected by 2+0 radio links, with the R_LOC/R_LOF alarms marked.]
Cause Analysis and Handling Procedure
Step 1 Checked historical performance events and the receive power within the period of alarm reporting.
The receive power was normal.
Step 2 Interchanged the IF cables at site B and checked for alarms for two days.
NE B-2 still reported service alarms. Therefore, site B was not faulty, and site A was possibly faulty.
[Figure: the same two sites after the IF cables at site B were interchanged, with the R_LOC/R_LOF alarms marked.]
Step 3 Restored the IF cable connections at site B, interchanged the IF cables at site A, and checked for alarms for two days.
NE B-1 reported service alarms. Therefore, the IF cable connecting NE A-2 and ODU A-2 was faulty.
[Figure: the same two sites after the IF cables at site A were interchanged, with the R_LOC/R_LOF alarms marked.]
Step 4 Replaced the faulty IF cable. The fault was rectified.
----End
Conclusions and Suggestions
If common methods fail to locate the causes of similar problems, you can replace involved
parts one by one.
1.8 Configuration Data Analysis
1.8.1 Application Scenarios
Incorrect operations or inherent characteristics (for example, at a micro level) of electronic equipment may corrupt or change the equipment's configuration data (for example, NE data and board data), leading to faults such as service interruptions. After locating faults to boards, you can analyze configuration data to further locate the faults.
1.8.2 Method Description
This method allows you to query equipment's configuration data, compare the data with
planned data, and analyze the data based on networking topologies and equipment
interconnections.
Radio hop configurations must comply with the following rules:
Each of the following parameters must be consistently set at both ends of a radio hop:
microwave working modes of IF boards (channel spacing, IF bandwidth, and modulation
scheme), number of E1s for Hybrid radio, and IEEE 1588 timeslots.
The transmit and receive frequencies of ODUs must be set correctly. To be specific, there
is a T/R spacing between the transmit frequencies of the Tx high and Tx low sites for a
radio hop. That is, for a radio hop, the transmit frequency of the Tx high site must be
equal to the receive frequency of the Tx low site, and the transmit frequency of the Tx low site must be equal to the receive frequency of the Tx high site.
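The rules above lend themselves to a mechanical comparison of the two ends of a hop. The following is a minimal sketch, not a Huawei tool: the HopEndConfig class, its field names, and the sample frequencies are hypothetical, and the values would have to be read from the NMS by hand.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class HopEndConfig:
    """Hypothetical snapshot of the settings at one end of a radio hop."""
    channel_spacing_mhz: float
    modulation: str
    e1_count: int
    tx_freq_mhz: float
    rx_freq_mhz: float


def check_radio_hop(tx_high: HopEndConfig, tx_low: HopEndConfig) -> list[str]:
    """Return a list of rule violations; an empty list means the hop is consistent."""
    problems = []
    # Parameters that must be set identically at both ends.
    for field in ("channel_spacing_mhz", "modulation", "e1_count"):
        if getattr(tx_high, field) != getattr(tx_low, field):
            problems.append(f"{field} differs between the two ends")
    # Each end must receive on the frequency the other end transmits on.
    if tx_high.tx_freq_mhz != tx_low.rx_freq_mhz:
        problems.append("Tx high transmit frequency != Tx low receive frequency")
    if tx_low.tx_freq_mhz != tx_high.rx_freq_mhz:
        problems.append("Tx low transmit frequency != Tx high receive frequency")
    # The Tx high site is the one transmitting on the higher frequency.
    if tx_high.tx_freq_mhz <= tx_low.tx_freq_mhz:
        problems.append("Tx high site does not transmit on the higher frequency")
    return problems
```

An empty result only shows that the two ends agree with each other; it does not show that they agree with the planned data, which still has to be compared separately.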
1.8.3 Application Example
Fault Symptoms
After an OptiX RTN 600 NE was configured, it operated normally. Its services, however,
were interrupted when it restarted following a power failure.
Cause Analysis and Handling Procedure
Step 1 Checked alarms.
The CONFIG_NOSUPPORT alarm, which indicated an incorrect frequency, had caused the
RADIO_MUTE alarm.
Step 2 Checked the parameter setting.
The preset Tx frequency was out of the Tx frequency range.
NOTE
If an incorrect Tx frequency value is applied to an unmuted ODU, the ODU reports a
CONFIG_NOSUPPORT alarm but remains in the unmute state, so its services are not interrupted. After
the Tx frequency is changed to a correct one, the CONFIG_NOSUPPORT alarm automatically clears.
However, if an incorrect Tx frequency value is applied to an ODU after the ODU is reset or powered off
or the NE is reset, the ODU remains in the mute state and so its services cannot be restored.
Step 3 Changed the Tx frequency to a correct value based on the network planning information.
The fault was rectified.
----End
Conclusions and Suggestions
If services are interrupted due to incorrect operations, check whether the configuration data is correct. In addition, analyzing alarms and their parameters helps locate configuration errors.
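The behavior described in the NOTE of this example can be summarized as a small decision function. This is only an illustrative sketch: the function names are hypothetical, the supported Tx range must come from the ODU data sheet or network plan, and the real equipment logic may consider more conditions than the NOTE mentions.

```python
def tx_frequency_in_range(tx_mhz, range_low_mhz, range_high_mhz):
    """True if the configured Tx frequency lies inside the supported range."""
    return range_low_mhz <= tx_mhz <= range_high_mhz


def odu_outcome(tx_in_range, odu_unmuted_when_applied):
    """Return (config_nosupport_alarm, odu_muted) as described in the NOTE.

    odu_unmuted_when_applied is False when the value takes effect after an
    ODU reset, an ODU power-off, or an NE reset.
    """
    if tx_in_range:
        return (False, False)
    if odu_unmuted_when_applied:
        # Alarm is reported, but the ODU keeps transmitting.
        return (True, False)
    # After a reset the ODU stays muted until the frequency is corrected.
    return (True, True)
```

The third case matches the fault in this example: the wrong value went unnoticed while the ODU was running and interrupted services only after the power failure.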
1.9 Tests Using Instruments and Tools
1.9.1 Application Scenarios
This method is used to locate equipment interworking faults and to test performance
indicators.
1.9.2 Method Description
Tools such as multimeters, SDH analyzers, SmartBits, and data service packet sniffers are
used to test equipment on live networks and to check whether faults are caused by equipment
faults or external factors.
1.9.3 Application Example
Fault Symptoms
In the network shown in Figure 1-7, the NMS set up data communication network (DCN) communication with NE1 and NE2 through the multiprotocol label switching (MPLS)
network. NE1 was connected to the MPLS network using a hub and communicated with the
MPLS network through the Open Shortest Path First (OSPF) protocol. The NMS pinged NE1
successfully but failed to ping NE2. Therefore, the NMS could not reach NE2. The routing table of NE1 indicated that NE1 did not learn routes to upstream NEs. The MPLS network had
multiple radio hops at its edge, but the fault occurred only between NE1 and NE2.
Figure 1-7 Fault example
[Figure: NE1 and NE2, with NE1 connected to the MPLS network through a hub.]
NOTE
NE1 and NE2 formed a radio hop.
Cause Analysis and Handling Procedure
Step 1 Connected the hub to a PC and used the data service packet sniffer to analyze the OSPF packets received by NE1.
The designated router (DR) IP addresses in the OSPF packets were xx.xx.xx.1, but the IP
address of the NE that sent the DR packets was xx.xx.xx.2. Therefore, NE1 did not receive
any DD packets sent by the DR elected on the OSPF subnet. As a result, NE1 could not create an adjacency with the DR and could not learn OSPF routes.
Step 2 Sniffed and analyzed OSPF packets at another OptiX RTN NE that was connected to the MPLS network and was operating normally.
The OptiX RTN NE received OSPF packets from the DR. Therefore, an OptiX RTN NE fault
was ruled out.
Step 3 Increased the priority of NE1's gateway (IP address: xx.xx.xx.2) so that the gateway became the DR on the subnet.
NE1 learned OSPF routes, and NE2 became reachable from the NMS.
----End
Conclusions and Suggestions
This method helps to locate equipment interworking faults or data service faults.
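The comparison made in Step 1 of the example (is the advertised DR ever seen as a packet source?) can be sketched in a few lines once a capture has been exported. The sketch below assumes that the OSPF Hello packets have already been parsed into dictionaries with the hypothetical keys "src" (source IP address) and "dr" (designated router field); parsing the capture file itself is outside its scope, and the addresses are documentation examples.

```python
def find_unseen_drs(hello_packets):
    """Return the advertised DR addresses that never appear as a packet source.

    hello_packets: iterable of dicts with the hypothetical keys
    'src' (sender IP) and 'dr' (DR address carried in the Hello packet).
    A DR field of 0.0.0.0 means that no DR has been elected and is ignored.
    """
    packets = list(hello_packets)
    senders = {p["src"] for p in packets}
    advertised = {p["dr"] for p in packets if p["dr"] != "0.0.0.0"}
    return sorted(advertised - senders)


capture = [
    {"src": "192.0.2.2", "dr": "192.0.2.1"},
    {"src": "192.0.2.2", "dr": "192.0.2.1"},
]
unseen = find_unseen_drs(capture)
```

A non-empty result indicates that the local NE hears neighbors naming a DR whose own packets never arrive on that segment, which is the situation found in this example.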
1.10 RMON Performance Analysis
1.10.1 Application Scenarios
If remote monitoring (RMON) is enabled on the NMS, you can perform Ethernet OAM
functions, loopbacks, and ping tests to locate service interruptions or performance deterioration.
1.10.2 Method Description
RMON can be used to transfer network monitoring data between network segments. RMON
achieves the following functions:
Storing all the statistics on the agent side and supporting offline manager operations
Storing historical data to facilitate fault diagnosis
Supporting error detection and reporting
Supporting multiple manager sites
The OptiX RTN equipment achieves RMON using the following management groups:
Ethernet statistics group
The Ethernet statistics group queries real-time performance statistics of Ethernet ports.
Ethernet history group
The Ethernet history group stores historical Ethernet performance statistics so that users can
obtain the Ethernet performance of a specific Ethernet port within a historical period. The
Ethernet history group supports the same items as the Ethernet statistics group.
Ethernet history control group
The Ethernet history control group specifies how to obtain historical Ethernet port
performance data.
Ethernet alarm group
The Ethernet alarm group reports an alarm if the value of a monitored item crosses the
preset threshold.
RMON covers the following statistical items:
Number of transmitted packets
Number of transmitted bytes
Number of received packets
Number of received bytes
Number of each type of bad packets
Number of discarded packets
1.10.3 Application Example
Fault Symptoms
Figure 1-8 shows a mobile network, where OptiX RTN 600 V100R003 NEs provided backhaul transmission. Packet loss occurred when BTS1 at site 1 and BTS2 at site 2 were pinged from
the RNC, but did not occur when BTS3 at site 3 was pinged.
Figure 1-8 Network example for RMON performance analysis
[Figure: NEs 1-001, 2-001, 2-002, 3-001, 3-002, and 4-001 across site 1, site 2, and site 3, connecting BTS1, BTS2, and BTS3 to the RNC.]
Cause Analysis and Handling Procedure
Step 1 Analyzed the RMON data of NE 3-002 to check whether packet loss was caused by insufficient radio bandwidth between site 2 and site 3.
The peak traffic volume of NE 3-001 had already reached its maximum air interface
bandwidth (25 Mbit/s). Therefore, packet loss was caused by congestion. For details, see the
following figure.
Step 2 Changed the air interface capacity of NE 3-001 as required.
Step 3 Performed a ping test.
No packet was lost.
----End
Conclusions and Suggestions
RMON shows data traffic and air interface bandwidth usage graphically.
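When only raw RMON byte counters are available, the same congestion check can be approximated numerically. The sketch below is an assumption-laden illustration: the function name is hypothetical, the counters are assumed to be cumulative byte counts sampled at a fixed interval, and the 25 Mbit/s capacity is the value from the example above.

```python
def peak_utilization(byte_counters, interval_s, capacity_mbps):
    """Return the highest utilization (0.0 to 1.0 or more) between samples.

    byte_counters: cumulative byte counts read at a fixed interval.
    interval_s: sampling interval in seconds.
    capacity_mbps: air interface capacity in Mbit/s.
    """
    if interval_s <= 0 or capacity_mbps <= 0:
        raise ValueError("interval and capacity must be positive")
    peak = 0.0
    for previous, current in zip(byte_counters, byte_counters[1:]):
        delta = current - previous
        if delta < 0:
            # Counter wrapped or was reset; skip this interval.
            continue
        rate_mbps = delta * 8 / interval_s / 1e6
        peak = max(peak, rate_mbps / capacity_mbps)
    return peak


# Three readings taken 900 s apart on a 25 Mbit/s link (made-up numbers).
samples = [0, 1_406_250_000, 4_218_750_000]
busiest = peak_utilization(samples, 900, 25)
```

A peak close to 1.0 means that the port was offered at least as much traffic as the radio link can carry during that interval. Because the result is an average over the interval, short bursts can cause loss even when the computed peak is lower.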
1.11 Network Planning Analysis
1.11.1 Application Scenarios
Network planning is crucial to radio link performance. To address availability issues not
caused by equipment faults, such as bit errors on radio links and frequent interruptions of
radio links, check whether correct methods are used during network planning and whether network planning is based on actual link conditions.
Based on the terrain and rainfall of the areas that radio links cover, network planning generally
determines operating frequencies, T/R spacing, transmit power, antenna heights, and
protection/diversity modes. Based on the preceding information, radio link indicators such as normal receive power, fading margin, and system availability can be obtained.
1.11.2 Method Description
The following items are often checked:
Availability: Check whether actual link availability meets customers' requirements. For rain zones (zones L, M, N, P, and Q specified by ITU-R), it is recommended that you use
low frequency bands and polarization direction V. For a radio link subject to severe
multipath fading, it is recommended that you increase the height difference between the antennas at both ends or use 1+1 SD protection as long as LOS is guaranteed.
Multipath fading prediction methods. Generally, the following methods are available:
ITU-R-P.530-7/8 method: It is globally applicable.
ITU-R-P.530-9 method: It is applicable to areas with high reflection gradients, for
example, the Middle East, the Mediterranean sea, and West Africa. It works with the
ITU-R-P.530-7/8 method. During the prediction, a low availability is used as the
calculation result.
KQ factor method: It is applicable to China (seldom used).
Vigants-Barnett method: It is applicable to North America.
Rain fading prediction methods. Generally, the following methods are available:
ITU-7: It is globally applicable.
R.K. Crane: It is applicable to North America.
For a link covering several rain zones, it is recommended that you select the zone with the heaviest rainfall for calculation.
1.11.3 Application Example
Fault Symptoms
A radio link frequently but intermittently reported MW_RDI, R_LOC, and RPS_INDI alarms,
and HSB switchovers were triggered.
Table 1-3 Link information
Protection                            1+1 HSB
IF board                              IF1B boards
IF mode                               IF mode 7 (28M/128QAM/STM-1)
ODU type                              SPA ODUs operating at the 8 GHz frequency band
Receiver sensitivity                  -70.5 dBm
Transmit power                        20 dBm
Receive power                         -39.5 dBm
Planned availability                  99.994%
Predicted annual interruption time    1877 seconds
Cause Analysis and Handling Procedure
Step 1 Queried historical receive power values of the radio link.
The receive power decreased to a value close to the receiver sensitivity when an alarm was
reported. Most alarms were reported during the night or in the early morning. When the weather was favorable at noon, the receive power was normal. Therefore, intermittent radio
link interruptions were caused by multipath fading.
Step 2 Checked annual interruption time predicted for the radio link.
The actual annual interruption time was longer than the predicted time of 1877 seconds.
Therefore, the fading margin was insufficient.
Step 3 Checked the network planning methods.
The ITU-R-P.530-7/8 method was used. The area covered by the radio link was in the Middle
East, and therefore the ITU-R-P.530-9 method should be used.
Step 4 Used the ITU-R-P.530-9 method to predict annual interruption time without changing other conditions.
The obtained value was about 175833 seconds, which was longer than the value obtained
using the ITU-R-P.530-7/8 method.
Figure 1-9 Using the ITU-R-P.530-7/8 method
Figure 1-10 Using the ITU-R-P.530-9 method
Step 5 Deleted 1+1 HSB protection settings and configured 1+1 SD protection. The link availability
met service requirements.
----End
Conclusions and Suggestions
Network planning is crucial to radio link performance. For radio links that are frequently interrupted due to fading, it is recommended that you first check their network planning
information.
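The figures in this example can be cross-checked with simple arithmetic. The sketch below converts between availability and annual interruption time and computes the flat fade margin. It assumes a 365-day year and assumes that the receive power and receiver sensitivity in Table 1-3 are negative dBm values (-39.5 dBm and -70.5 dBm), as is usual for microwave receivers; the function names are illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, assuming a 365-day year


def allowed_outage_seconds(availability_percent):
    """Annual interruption budget implied by an availability target."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


def availability_from_outage(outage_seconds):
    """Availability (in percent) implied by an annual interruption time."""
    return (1 - outage_seconds / SECONDS_PER_YEAR) * 100


def fade_margin_db(receive_power_dbm, sensitivity_dbm):
    """Flat fade margin: how far the receive power can drop before the threshold."""
    return receive_power_dbm - sensitivity_dbm


budget = allowed_outage_seconds(99.994)          # about 1892 s per year
margin = fade_margin_db(-39.5, -70.5)            # 31 dB
predicted = availability_from_outage(175833)     # about 99.44 %
```

Under these assumptions, the 1877 seconds predicted with the ITU-R-P.530-7/8 method fits within the budget of roughly 1892 seconds implied by 99.994%, whereas the 175833 seconds predicted with the ITU-R-P.530-9 method corresponds to an availability of only about 99.44%, far below the plan.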
2 Troubleshooting Process and Guide
This chapter describes the general troubleshooting process, fault categories, and how to
diagnose each category of faults.
2.1 Troubleshooting Process Overview
Figure 2-1 Troubleshooting flowchart
[Flowchart: Start. (1) Record fault symptoms. (2) Caused by external factors? If yes, rectify external faults. (3) Diagnose the fault. Is the fault rectified? If no, (4) report to Huawei and work out solutions together. Is the fault rectified? If yes, write a troubleshooting report. End.]
Table 2-1 Remarks about the troubleshooting process
Mark    Explanation
1       When recording fault symptoms, record them in as much detail as possible.
        Also record other important information, for example, the exact time when the fault occurs, operations performed before and after the fault occurs, alarms,
        and performance events.
        You can collect fault data using the Data Collector (DC) tool that is
        integrated with the U2000.
2       External factors include power supply, fibers/cables, environment, and
        terminal equipment (such as switches).
3       Find causes of a fault with reference to section 1.2 "Common Methods for
        Locating Faults", determine the category of the fault with reference to
        section 2.2 "Fault Categories", and rectify the fault as instructed in the
        corresponding section listed below:
        2.3 Troubleshooting Radio Links
        2.4 Troubleshooting TDM Services
        2.5 Troubleshooting Data Services
        2.6 Troubleshooting Microwave Protection
        2.7 Troubleshooting Clocks
        2.8 Troubleshooting DCN Communication
4       Contact the Huawei local office or dial the Huawei technical service hotline for
        problem reporting and technical support.
CAUTION
When handling critical problems such as a service interruption, take the following precautions:
Restore services as soon as possible.
Analyze fault symptoms, find causes, and then handle faults. If causes are unknown, proceed carefully when you perform operations so that the problems do not become more severe.
If a fault persists, contact Huawei engineers and coordinate with them to handle the fault
promptly.
Record the operations performed during fault handling and save the original data related to
the fault.
2.2 Fault Categories
Table 2-2 Fault categories
Fault Category                                      Typical Symptom
Hardware fault                                      Equipment reports hardware alarms such as BD_STATUS and HARD_BAD.
Radio link fault                                    Radio links report link-related alarms such as MW_LOF and RADIO_RSL_LOW, or have bit errors.
Time division multiplexing (TDM) service fault      Radio links work normally but their carried TDM services are interrupted or deteriorate.
Data service fault                                  Radio links work normally but their carried data services have packet loss or are unavailable.
Protection fault                                    Protected radio links or their carried services are faulty, or protection switching fails (no switchover is performed or services are unavailable after switching is complete).
Clock fault                                         NEs report clock alarms.
DCN fault                                           NEs fail to be managed by the NMS or do not respond to commands from the NMS.
2.3 Troubleshooting Radio Links
2.3.1 Radio Link Faults
Fault Causes
Causes of radio link faults are classified into the following categories:
Equipment faults, including indoor unit (IDU) faults, outdoor unit (ODU) faults, and power
faults
Propagation faults, including fading, interference, and poor LOS
Poor construction quality, including poor antenna/component installation, poor
grounding, and poor waterproofing
Figure 2-2 Causes of radio link faults
[Tree diagram: Causes of radio link faults]
Propagation faults
    Interference: external interference, over-reach interference
    Fading: rain fading, multipath fading, reflection
    Poor LOS: LOS not achieved, near-field blocking
Poor construction quality
    Antenna installation: antennas not aligned, antennas loosened or offset
    Cables: poor grounding, poor waterproofing, damaged cable components
Equipment faults
    IDU faults
    ODU or outdoor component faults
    Power faults
Troubleshooting Process
Figure 2-3 illustrates the process for diagnosing a radio link fault.
Figure 2-3 Process for diagnosing a radio link fault
[Flowchart. Decision points: Does the link report link-related alarms like MW_LOF or bit error events like UAS/SES? Do hardware alarms exist? Is the RSL less than the designed value if no fading occurs? Is the RSL greater than the receiver sensitivity? Is it raining when the fault occurs? Does the fault occur regularly? Is the link interruption time greater than the designed value?
Outcomes: Rectify equipment faults. The link is blocked, the antennas are offset, or passive components like hybrid couplers or flexible waveguides are faulty. Co-channel or adjacent-channel interference occurs, or large-delay multipath reflection occurs. Rain fading. Multipath fading. Terrain reflection. Check whether the designed value is appropriate. Analyze the configuration data and replace components that are suspected to be faulty. Handle the fault accordingly.]
2.3.2 Signal Propagation Faults
Among the causes of radio link faults, faults caused by active equipment are easily located because
they generally occur with alarms, whereas signal propagation faults (including faults caused
by unaligned antennas), which occur frequently, are difficult to locate.
Table 2-3 provides typical symptoms of and solutions to signal propagation faults.
Table 2-3 Typical symptoms of and solutions to signal propagation faults

Fault Type: Multipath fading
Typical Symptom:
    The receive power changes greatly and quickly (generally from 10 dB to dozens of dB within seconds). The changes occur periodically, especially during the transition between day and night.
    A typical symptom of duct-type fading is that the receive power undergoes substantial up-fading and down-fading.
Solution:
    Increase the path inclination by adjusting the antenna mount heights at both ends, thereby increasing the height difference between the antennas at both ends.
    Reduce surface reflection. For obvious strong reflection surfaces, for example, large areas of water, flat lands, and bare mountain tops, adjust antennas to move reflection points out of the strong reflection areas or mask the reflection by using landforms.
    Reduce the path clearance. With LOS conditions guaranteed, lower antenna mount heights as much as possible.
    Use space diversity or increase the fading margin. In normal conditions, space diversity is the most efficient method for decreasing multipath fading.

Fault Type: Interference
Typical Symptom:
    A link's receive power is greater than the receiver sensitivity, but the link is interrupted or has bit errors.
    When no fading occurs, an IF board reports a radio link alarm, especially when interference is strong.
    When interference occurs at the local end (interference signal power greater than -90 dBm), the local receive power is greater than -90 dBm after the peer ODU is muted.
    A frequency scanner can detect interference signal power when being tuned to the operating band of an ODU.
Solution:
    Plan frequencies or polarization directions properly. In theory, a large spacing between the operating frequency of target signals and the operating frequency of interference signals reduces interference. Meanwhile, note issues such as frequency resources and network-wide planning.
    Plan Tx high and Tx low sites properly. If multiple ODUs provide multiple microwave directions at a site, plan the site as a Tx high site or Tx low site for all microwave directions, if possible.
    Plan microwave routes properly. Generally, adopt Z-shaped radio link distribution to prevent over-reach interference.

Fault Type: Rain fading
Typical Symptom:
    When it rains, a link may be interrupted or deteriorate.
Solution:
    Increase link fading margin, use low frequency bands, or use vertical polarization.
    Increase link fading margin for rain zones L, M, N, P, and Q.
    Rain fading impairs radio links that operate at high frequency bands, especially frequency bands higher than 18 GHz. Radio links operating at frequency bands lower than 10 GHz are not affected. If rain fading is severe, change radio links' operating frequency bands, if necessary.
    Rain fading in horizontal polarization is more severe than that in vertical polarization.

Fault Type: Poor LOS
Typical Symptom:
    The receive power is always lower than the designed power.
Solution:
    If radio links or antennas are blocked, adjust antenna mount heights or positions to bypass obstacles.
    Adjust deviated antennas.
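The symptoms in Table 2-3 can serve as a rough first-pass triage. The sketch below is only an illustration of that idea: the class, the field names, the 10 dB swing threshold (taken from the multipath fading row), and the order in which the rules are tried are all assumptions, and a real diagnosis also requires the checks described elsewhere in this chapter.

```python
from dataclasses import dataclass


@dataclass
class LinkObservation:
    """Hypothetical summary of what was observed on one radio link."""
    link_errored: bool              # link interrupted or has bit errors
    raining: bool                   # it rained when the fault occurred
    rsl_dbm: float                  # receive power when the fault occurred
    sensitivity_dbm: float          # receiver sensitivity
    rsl_swing_db: float             # fast receive power variation within seconds
    rsl_always_below_design: bool   # receive power persistently below design


def suggest_fault_type(obs):
    """Map observed symptoms to the fault type that Table 2-3 points to."""
    if obs.raining and obs.link_errored:
        return "rain fading"
    if obs.rsl_swing_db >= 10.0:
        return "multipath fading"
    if obs.link_errored and obs.rsl_dbm > obs.sensitivity_dbm:
        return "interference"
    if obs.rsl_always_below_design:
        return "poor LOS"
    return "undetermined"
```

The function returns a single suggestion, although faults can coexist in practice (for example, interference on a link that also has marginal LOS), so the result is a starting point for the checks in the table rather than a conclusion.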
2.4 Troubleshooting TDM Services
Fault Symptoms
TDM services are interrupted or have bit errors.
Cause Analysis and Handling Procedure
No.        Possible Cause and Handling Procedure
Cause 1    Possible cause: The hardware is faulty.
           Handling procedure: Analyze alarms and perform loopbacks to check whether board hardware is faulty. If a board is faulty, replace the board.
Cause 2    Possible cause: A radio link is faulty.
           Handling procedure: On the NMS, find the occurrence period of the fault and check whether any service alarm is generated on the radio link. If a radio link alarm is generated, first rectify radio link faults.
Cause 3    Possible cause: Services are incorrectly configured.
           Handling procedure: Check whether an MW_CFG_MISMATCH alarm is generated on the link. Verify that the number of E1s is the same at both ends of the link.
Cause 4    Possible cause: The temperature of a board is very high.
           Handling procedure: On the NMS, query the temperatures of components setting up the faulty link, and check whether any temperature alarm is generated. If the ODU temperature is very high, take temperature control measures, for example, installing a sunshade. If the IDU temperature is very high, verify that temperature control devices, for example, air conditioners, work normally, and verify that the exhaust vents of the IDU are not covered or obstructed.
Cause 5    Possible cause: Power supply voltage fluctuates, the grounding is improper, or external interference exists.
           Handling procedure: Check whether the voltage of the external input power supply fluctuates or whether the equipment is grounded improperly.
2.5 Troubleshooting Data Services
This section describes how to diagnose data service faults with different symptoms and
affected scopes.
2.5.1 Services at All Base Stations on an Entire Network or in an Area Are Interrupted
Fault Symptoms
On a network, services at all base stations, which are converged at level 1 or level 2
convergence nodes and then transmitted to base station controllers (BSCs)/RNCs, are
interrupted. To be specific, all voice services, Internet access services, and video services are interrupted.
Cause Analysis
If services at all base stations on an entire network or in an area are interrupted, faults probably occur at the convergence nodes that are interconnected with BSCs/RNCs. Therefore,
check for the following faults at convergence nodes:
Board hardware fault
Port fault
Configuration error
Equipment interconnection fault
If this type of fault occurs, contact the maintenance personnel for the interconnected
equipment.
Fault Locating Measures
NOTE
Before locating faults, collect data of all NEs that are possibly faulty, if possible.
Step 1 Rule out hardware faults and radio link faults with reference to section 2.2 "Fault Categories" and section 2.3 "Troubleshooting Radio Links."
Step 2 Check whether upstream convergence ports at the convergence nodes report equipment alarms.
If: These ports report any of the following equipment alarms: ETH_LOS, LASER_MOD_ERR, LASER_NOT_FITED, ETH_NO_FLOW
Then: Clear the alarms as instructed in "Alarms and Handling Procedures" in the Maintenance Guide.
If: These ports do not report equipment alarms
Then: Go to the next step.
Step 3 Check RMON statistics about upstream convergence ports at the convergence nodes.
If: The ports receive data but do not transmit data
Then: The boards where the ports are located may be faulty. In this case, go to the next step.
If: The ports do not receive data
Then: The interconnected equipment is faulty. In this case, rectify the fault by following instructions in chapter 3 "Equipment Interworking Guide".
If: The ports receive and transmit data
Then: Go to the next step.
Step 4 Check the Ethernet bandwidths provided by radio links at the convergence nodes.
If: The Ethernet bandwidths provided by radio links are insufficient
Then: Expand capacities of the radio links to increase Ethernet bandwidths.
If: The Ethernet bandwidths provided by radio links are sufficient
Then: Go to the next step.
Step 5 Check service configurations.
1. Check Ethernet service configurations at the convergence nodes.
If: No Ethernet service is configured, or source/sink ports are incorrectly set
Then: Re-configure Ethernet services and check whether the services recover. If not, go to the next step.
If: Ethernet services are configured as planned
Then: Go to the next step.
2. Check attributes of service ports at the convergence nodes.
If: Attributes of the service ports are incorrectly set
Then: Set attributes for the service ports again (including port enabled/disabled, tag attribute, and default VLAN) and check whether the services recover. If not, go to the next step.
If: Attributes of the service ports are correctly set
Then: Go to the next step.
3. Check service VLANs at the convergence nodes.
If: VLAN settings are inconsistent with actual services
Then: Re-set VLANs for the services and check whether the services recover. If not, go to the next step.
If: VLAN settings are consistent with actual services
Then: Go to the next step.
Step 6 Reset the NEs at the convergence nodes.
NOTE
If the fault persists after all the preceding steps are performed, dial the Huawei technical service hotline or
contact the Huawei local office.
----End
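The RMON check in Step 3 of this procedure reduces to a comparison of two counters. A minimal sketch follows; the function name and the returned strings are illustrative, and the inputs are assumed to be the increases in the receive and transmit packet counts of an upstream convergence port over the observation period.

```python
def diagnose_convergence_port(rx_packets_delta, tx_packets_delta):
    """Apply the Step 3 table to the packet counter increases of one port."""
    if rx_packets_delta == 0:
        # The port receives nothing: look at the interconnected equipment.
        return "interconnected equipment is faulty; see the Equipment Interworking Guide"
    if tx_packets_delta == 0:
        # The port receives but does not transmit: suspect the local board.
        return "board hosting the port may be faulty; go to the next step"
    return "port receives and transmits data; go to the next step"
```

The counters should be sampled twice over a period in which traffic is expected, because a single reading of a cumulative counter does not show whether data is currently flowing.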
2.5.2 Services at All Base Stations on an Entire Network or in an Area Experience Packet Loss
Fault Symptoms
Services at all base stations on an entire network or in an area experience packet loss. For
example, all Internet service users experience a low access rate, calls are delayed, ping packets between BSCs/RNCs and base stations are lost, or artifacts appear in video services.
Cause Analysis
If services at all base stations on an entire network or in an area experience packet loss, faults
probably occur at convergence nodes (possibly OptiX PTN 1900 or OptiX RTN 950) that are
interconnected with BSCs/RNCs. Therefore, check for the following faults at the convergence nodes (the possibility of service configuration errors is eliminated because the services are not
interrupted):
Incorrect parameter setting (for example, mismatched working modes) for Ethernet ports
Network cable or fiber fault
Service traffic exceeding preset bandwidth
Member link fault in link aggregation groups (LAGs)
Oversized burst traffic
Broadcast storm
Inappropriate quality of service (QoS) parameter setting
Fault Locating Measures
NOTE
Before locating faults, collect data of all NEs that are possibly faulty, if possible.
Step 1 Check whether the convergence nodes report alarms.
If: The convergence nodes report alarms like ETH_LOS or experience alarm jitters
Then: Clear the alarms as instructed in "Alarms and Handling Procedures" in the Maintenance Guide. If the alarms clear, check whether the fault is rectified. If the alarms persist, go to the next step.
If: The convergence nodes do not report an alarm
Then: Go to the next step.
Step 2 At the convergence nodes, check whether the ports used for interconnection and their peer ports at the interconnected equipment are consistently set.
If: The ports' working modes are inconsistent with their peer ports' working modes
Then: Change their working modes to the same and check whether the fault is rectified. If not, check the next item.
If: The ports' physical states are different from the settings
Then: Verify fiber connections or network cable connections at the ports. Then, enable the ports again and check whether the fault is rectified. If not, check the next item.
If: The ports' maximum transmission unit (MTU) settings are different from actual packet lengths
Then: Change the value of the MTU parameter to 9600 bytes and check whether the fault is rectified. If not, check the next item.
If: The ports are physically normal
Then: Go to the next step.
Cause Analysis
If services at some base stations are interrupted, certain equipment on the transmission link is
faulty. To diagnose the fault, check service continuity on the link and RMON counts of
service ports, determine the fault scope, and check for the following faults at those possibly
faulty nodes:
Board hardware fault
Boards not installed
Abnormal physical ports (used for interconnection)
Service configuration error
Fault Locating Measures
NOTE
Before locating faults, collect data of all NEs that are possibly faulty, if possible.
Step 1 Check service continuity on each branch of the faulty link to determine the fault scope.
If: The services from base stations or OptiX RTN NEs to an NE on the faulty link are available, but the services from the faulty link to the NE are interrupted.
Then: The NE or its next-hop NE on the faulty link is faulty. In this case, go to the next step.
If: An NE on the faulty link receives data but does not transmit data, or transmits data but does not receive data.
Then: If an NE on the faulty link transmits data but does not receive data, check the traffic counts of its next-hop NE. Repeat this operation until you locate the NE that does not transmit data. The located NE is considered a faulty NE. Then, go to the next step.
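The hop-by-hop walk described in Step 1 can be sketched as follows. The `HopCounts` record and the sample counts are hypothetical; in practice the counts would come from traffic statistics read on each NE over the same interval.

```python
from typing import NamedTuple, Optional, Sequence


class HopCounts(NamedTuple):
    """Hypothetical per-NE traffic counts read over the same interval."""
    name: str
    rx_packets: int
    tx_packets: int


def locate_silent_ne(hops: Sequence[HopCounts]) -> Optional[str]:
    """Walk next-hop NEs in order and return the first one that transmits nothing."""
    for hop in hops:
        if hop.tx_packets == 0:
            return hop.name
    return None  # every NE transmits, so the fault lies outside this walk


# NE-A transmits but receives nothing, so the walk continues to its next hops.
link = [
    HopCounts("NE-A", rx_packets=0, tx_packets=5200),
    HopCounts("NE-B", rx_packets=5200, tx_packets=5200),
    HopCounts("NE-C", rx_packets=5200, tx_packets=0),
]
print(locate_silent_ne(link))
```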
Step 2 Check whether the faulty NE reports alarms.
If: The NE reports alarms.
Then: Clear the alarms as instructed in "Alarms and Handling Procedures" in the Maintenance Guide. If the alarms clear, check whether the fault is rectified. If not, go to the next step.
If: The NE does not report an alarm.
Then: Go to the next step.
Step 3 At the faulty NE, check whether the port used for interconnection and its peer port at the interconnected equipment are consistently set.
If: The port's working mode is inconsistent with its peer port's working mode.
Then: Change the working mode to be the same and check whether the fault is rectified. If not, check the next item.
If: The port's physical state is different from the setting.
Then: Verify the fiber connection or network cable connection at the port. Then, enable the port again and check whether the fault is rectified. If not, check the next item.
If: The port's MTU setting is different from the actual packet length.
Then: Change the value of the MTU parameter to 9216 bytes and check whether the fault is rectified. If not, go to the next step.
If: The port is physically normal.
Then: Go to the next step.
Step 4 Check service configurations at the faulty NE.
1. Check whether services are correctly configured.
If: The services are not configured or are incorrectly configured.
Then: Re-configure the services and check whether the services recover. If not, go to the next step.
If: The services are correctly configured.
Then: Go to the next step.
2. Check attributes of the service ports.
If: Attributes of the service ports are incorrectly set.
Then: Set attributes for the service ports again (including port enabled/disabled, tag attribute, Layer 2/Layer 3 attribute, and default VLAN) and check whether the services recover. If not, go to the next step.
If: Attributes of the service ports are correctly set.
Then: Go to the next step.
3. Check the service VLAN. If the service VLAN is incorrectly set, re-set it.
NOTE
If the fault persists after all the preceding steps are performed, dial the Huawei technical service hotline or contact the Huawei local office.
----End
2.5.4 Services at Some Base Stations in an Area Experience Packet Loss
Fault Symptoms
Services at some base stations in an area experience packet loss. For example, some users experience a low Internet access rate, calls are delayed, some ping packets between a BSC and its subordinate base stations are lost, or artifacts appear in video services.
Cause Analysis
If services at some base stations experience packet loss, certain equipment on the transmission
link is faulty. To diagnose the fault, check service continuity on the link and RMON counts of
service ports, determine the fault scope, and check for the following faults at those possibly
faulty nodes:
Abnormal physical ports (used for interconnection)
Service traffic exceeding preset bandwidth
Oversized burst traffic
Broadcast storm
Inappropriate QoS parameter setting
Fault Locating Measures
NOTE
Before locating faults, collect data of all NEs that are possibly faulty, if possible.
Step 1 Check RMON counts of ports on the faulty link, and determine the fault scope by comparing traffic volumes at involved NEs.
If: The volume of traffic received by an NE is greater than the volume of traffic transmitted by the NE.
Then: Consider the NE as a faulty NE and go to the next step.
If: The volume of traffic received by an NE is equal to the volume of traffic transmitted by the NE, but both volumes are too low.
Then: Check the traffic volume at the next-hop NE. Repeat this operation until you locate the NE whose volume of received traffic differs greatly from its volume of transmitted traffic. The located NE is considered a faulty NE. Then, go to the next step.
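The comparison of received and transmitted volumes in Step 1 can be sketched as below. The `tolerance` threshold is an assumption introduced for illustration, since the procedure does not state how large a difference counts as significant; the NE names and byte counts are made up.

```python
from typing import Iterable, Optional, Tuple


def locate_lossy_ne(hops: Iterable[Tuple[str, int, int]],
                    tolerance: float = 0.01) -> Optional[str]:
    """Return the first NE whose received volume clearly exceeds its transmitted volume.

    Each item is (name, rx_bytes, tx_bytes), listed in next-hop order.
    """
    for name, rx_bytes, tx_bytes in hops:
        if rx_bytes == 0:
            continue  # nothing received, so no loss can be measured here
        loss_ratio = (rx_bytes - tx_bytes) / rx_bytes
        if loss_ratio > tolerance:
            return name
    return None


hops = [
    ("NE-1", 1_000_000, 1_000_000),   # equal volumes: move on to the next hop
    ("NE-2", 1_000_000, 820_000),     # receives far more than it transmits
    ("NE-3", 820_000, 820_000),
]
print(locate_lossy_ne(hops))
```

Counts should be sampled over the same interval on every NE; otherwise the ratio compares unrelated traffic.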
Step 2 Check whether the faulty NE reports alarms.
If: The NE reports alarms like ETH_LOS or experiences alarm jitters.
Then: Clear the alarms as instructed in "Alarms and Handling Procedures" in the Maintenance Guide. If the alarms clear, check whether the fault is rectified. If not, go to the next step.
If: The NE does not report an alarm.
Then: Go to the next step.
Step 3 At the faulty NE, check whether the port used for interconnection and its peer port at the interconnected equipment are consistently set.
If: The port's working mode is inconsistent with its peer port's working mode.
Then: Change the working mode to be the same and check whether the fault is rectified. If not, check the next item.
If: The port's physical state is different from the setting.
Then: Verify the fiber connection or network cable connection at the port. Then, enable the port again and check whether the fault is rectified. If not, check the next item.
If: The port's MTU setting is different from the actual packet length.
Then: Change the value of the MTU parameter to 9600 bytes and check whether the fault is rectified. If not, check the next item.
If: The port is physically normal.
Then: Go to the next step.
Step 4 Check RMON counts of each port on the faulty NE.
If: The total volume of traffic converged to an upstream service port exceeds the maximum bandwidth configured for the port.
Then: Split the traffic or increase the maximum bandwidth configured for the port. Then check whether the fault is rectified. If not, check the next item.
If: The burst traffic volume at an upstream service port exceeds the maximum bandwidth configured for the port.
Then: Enable traffic shaping for the port, and check whether the fault is rectified. If not, check the next item.
If: The traffic volume at the faulty NE is normal.
Then: Go to the next step.
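The bandwidth checks in Step 4 can be sketched as a simple classification. The function name, the units (Mbit/s), and the idea of feeding it an average rate per converged flow plus an observed peak rate are assumptions made for illustration; the returned strings paraphrase the actions in the table.

```python
from typing import Sequence


def assess_upstream_port(flow_rates_mbps: Sequence[float],
                         peak_rate_mbps: float,
                         max_bandwidth_mbps: float) -> str:
    """Classify an upstream service port following the order of the Step 4 table."""
    total = sum(flow_rates_mbps)
    if total > max_bandwidth_mbps:
        # Sustained converged traffic exceeds the configured bandwidth.
        return "split traffic or increase maximum bandwidth"
    if peak_rate_mbps > max_bandwidth_mbps:
        # Only bursts exceed the configured bandwidth.
        return "enable traffic shaping"
    return "traffic volume normal"


print(assess_upstream_port([40, 35, 30], peak_rate_mbps=120, max_bandwidth_mbps=100))
print(assess_upstream_port([30, 30, 20], peak_rate_mbps=130, max_bandwidth_mbps=100))
```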
Step 5 Check whether QoS settings are appropriate if QoS policies are configured for the faulty NE.
If: The rates preset for QoS control are lower than actual bound bandwidths.
Then: Modify QoS settings.
NOTE
If the fault persists after all the preceding steps are performed, dial the Huawei technical service hotline or contact the Huawei local office.
----End
2.6 Troubleshooting Microwave Protection
2.6.1 Switchover Failure or Delay in Microwave 1+1 Protection
Fault Symptoms
A switchover in microwave 1+1 protection, triggered by a radio link fault or an equipment
fault, fails or is delayed.
Cause Analysis and Handling Procedure
Cause 1: The microwave 1+1 protection group is in the forced or lockout switching state, causing a switchover failure.
Handling procedure: Check the current switching state and switching records of the microwave 1+1 protection group.
Cause 2: In the microwave 1+1 protection group, both the main and standby links are interrupted or both the main and standby units are faulty, resulting in a switchover failure.
Handling procedure: Check the alarms reported by boards in the microwave 1+1 protection group, and the current switching state of the microwave 1+1 protection group.
Cause 3: The NE is being reset or a switchover between the main and standby system control boards has just occurred, resulting in a switchover failure or a delayed switchover.
Handling procedure: Check the alarms reported by the NE, switchover records of the main and standby system control boards (OptiX RTN 950/980 NEs support main and standby system control boards), and the current switching state of the microwave 1+1 protection group.
Cause 4: An RDI-caused switchover is triggered immediately after a switchover is complete. As the RDI-caused switchover needs to wait for the expiration of the wait-to-restore (WTR) timer (in revertive mode, the waiting time is the preset WTR time; in non-revertive mode, the waiting time is 300 s), the switchover is delayed.
Handling procedure: Check the alarms reported by the NE, and parameter settings, current switching state, and switching records of the microwave 1+1 protection group.
Cause 5: In OptiX RTN 600 V100R005/OptiX RTN 900 V100R002C02 and later versions, anti-jitter is provided for switchovers triggered by RDIs and service alarms, to prevent repeated microwave 1+1 protection switchovers caused by deep and fast fading. As a result, some switchovers are delayed.
Handling procedure: Check the alarms reported by the NE, and the current switching state and switching records of the microwave 1+1 protection group.
Cause 6: The NE is incorrectly configured or installed, for example, IFH2 and EMS6 boards on OptiX RTN 620 NEs are incorrectly connected.
Handling procedure: Check the NE configuration and installation according to the microwave 1+1 configuration standards.
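The waiting time described under Cause 4 can be written out as a small rule. This is a sketch only: the function and constant names are hypothetical, and the 300 s value is the figure stated for non-revertive mode.

```python
NON_REVERTIVE_WAIT_S = 300  # fixed waiting time stated for non-revertive mode


def rdi_switchover_wait(revertive: bool, wtr_time_s: int) -> int:
    """Waiting time, in seconds, before an RDI-caused switchover can proceed."""
    # In revertive mode the preset WTR time applies; otherwise the fixed value.
    return wtr_time_s if revertive else NON_REVERTIVE_WAIT_S


print(rdi_switchover_wait(revertive=True, wtr_time_s=600))
print(rdi_switchover_wait(revertive=False, wtr_time_s=600))
```

When a switchover seems late, comparing the observed delay with this value helps separate an expected WTR wait from a genuine fault.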
2.6.2 Failure to Switch to the Main Unit in Microwave 1+1 Protection
Fault Symptoms
A microwave 1+1 protection group fails to switch services back to its main unit although its
main link or unit recovers.
Cause Analysis and Handling Procedure
Cause 1: The microwave 1+1 protection group works in non-revertive mode.
Handling procedure: Check whether the revertive mode is enabled for the microwave 1+1 protection group. If not, enable it.
Cause 2: The current switching state of the microwave 1+1 protection group is RDI, so an automatic revertive switchover cannot take place.
Handling procedure: Check whether the current switching state of the microwave 1+1 protection group is RDI. If yes, manually clear the RDI state.
Cause 3: When the microwave 1+1 protection group is in the WTR state, the microwave 1+1 protocol detects that the main unit is faulty. As a result, revertive switchover to the main unit fails.
Handling procedure: Check whether boards in the microwave 1+1 protection group report hardware alarms. If yes, handle the alarms.
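The three causes above can be checked in order, as in the following sketch. The function name and its inputs are hypothetical; the state names "RDI" and "WTR" are taken from the table, and how they are read from the equipment is not covered here.

```python
def diagnose_revert_failure(revertive: bool, switching_state: str,
                            hardware_alarm: bool) -> str:
    """Map the observed protection-group status to the matching handling procedure."""
    if not revertive:
        return "enable revertive mode"                      # Cause 1
    if switching_state == "RDI":
        return "manually clear the RDI state"               # Cause 2
    if switching_state == "WTR" and hardware_alarm:
        return "handle the hardware alarms"                 # Cause 3
    return "no cause in the table matches"


print(diagnose_revert_failure(revertive=True, switching_state="RDI",
                              hardware_alarm=False))
```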
2.6.3 Switchover Failure or Delay in SNCP Protection
Fault Symptoms
After the working channel of a subnetwork connection protection (SNCP) protection group becomes faulty, an SNCP switchover fails or is delayed.
Cause Analysis and Handling Procedure
Cause 1: The SNCP protection group is in the forced or lockout switching state, causing a switchover failure.
Handling procedure: Check the current switching state and switching records of the SNCP protection group.
Cause 2: Both the working and protection channels in the SNCP protection group are unavailable, resulting in a switchover failure.
Handling procedure: Check the alarms reported by boards in the SNCP protection group, and the current switching state of the SNCP protection group.
Cause 3: The NE is being reset or a switchover between the main and standby system control boards has just occurred, resulting in a switchover failure or a delayed switchover.
Handling procedure: Check the alarms reported by the NE, the records of switchovers between the main and standby system control boards, and the current switching state of the SNCP protection group.
Cause 4: On an SNCP ring formed by NEs using both SDH and Hybrid boards, some NEs use NE software earlier than OptiX RTN 600 V100R005 or OptiX RTN 900 V100R002C02, or E1_AIS insertion is disabled for some NEs.
Handling procedure: Find the NEs whose NE software versions are earlier than OptiX RTN 600 V100R005 or OptiX RTN 900 V100R002C02, and the NEs for which E1_AIS insertion is disabled.
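Finding the NEs named in Cause 4 comes down to comparing software versions against a baseline and checking the E1_AIS insertion setting. The sketch below assumes version strings of the form V, R, and optional C fields followed by digits (as in V100R002C02), and treats a missing C field as 0; both are assumptions, and real version strings may carry further suffixes that this parser rejects. All function names are hypothetical.

```python
import re
from typing import Tuple

_VERSION = re.compile(r"V(\d+)R(\d+)(?:C(\d+))?")


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split a version string such as 'V100R002C02' into comparable integers."""
    match = _VERSION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    v, r, c = match.groups()
    return int(v), int(r), int(c or 0)  # assumption: missing C field counts as 0


def is_earlier(version: str, baseline: str) -> bool:
    """True if `version` sorts before `baseline`."""
    return parse_version(version) < parse_version(baseline)


def needs_attention(version: str, baseline: str, e1_ais_insertion: bool) -> bool:
    """Flag an NE that runs older software or has E1_AIS insertion disabled."""
    return is_earlier(version, baseline) or not e1_ais_insertion


# Example for an OptiX RTN 900 NE, using the baseline named in Cause 4.
print(needs_attention("V100R002C01", "V100R002C02", e1_ais_insertion=True))
```

Use the baseline that matches the product series of each NE; versions of different series are not comparable with each other.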
2.7 Troubleshooting Clocks
2.7.1 Analyzing Clock Faults
Fault Symptoms
Fault symptom: Bit errors occur in services.
Alarm:
CLK_NO_TRACE_MODE or EXT_SYNC_LOS
EXT_TIME_LOC
HARD_BAD or SYN_BAD
LTI
S1_SYN_CHANGE
SYNC_C_LOS
Impact on system: If a clock source is lost or its quality deteriorates, the quality of services tracing the clock source is affected. As a result, pointer justifications occur and the bit error rate increases.
Possible Causes
Possible causes of clock faults are as follows:
A clock source in the system clock source priority list is lost.
All external clock sources of an NE are lost. As a result, the NE's clock enters an abnormal state.
In synchronization status message (SSM) mode, clock sources are switched so the clock source that an NE traces is switched.
The clock source that an NE traces deteriorates.
The clock source that an external clock port traces is lost.
The system clock does not work in locked mode.
The clock source that an external time port traces is lost.
2.7.2 Handling Common Clock Alarms
The OptiX RTN equipment provides various clock alarms to help locate clock faults. When a
clock system becomes faulty, rectify the fault based on reported alarms.
EXT_SYNC_LOS
Cause 1: The clock input mode (2 Mbit/s or 2 MHz) configured for an external clock source is different from the actual clock input mode.
Handling procedure: On the NMS, check whether the clock input mode configured for the external clock source is the same as the actual clock input mode. If not, change the clock input mode for the external clock source. Then, check whether the alarm clears.
Cause 2: A system control, switching, and timing board is faulty.
On the