Performance Alarms


Upload: krishan-pal-singh-sengar

Post on 27-Nov-2014


Page 1: Performance Alarms

Workshop for Performance Related Alarms

1) Performance alarm collection (identification of the alarms and the procedure for collecting them)

2) Alarms to be analysed for performance improvement (awareness of the causes of occurrence, criteria for analysis)

3) Understanding the correlation of alarms (simultaneous occurrence of various alarms)

4) Actions to be taken for alarm reduction (methodology and procedure for alarm reduction)

5) Frequency of alarm monitoring (schedule for alarm collection and analysis)

6) Important KPIs that can be affected by alarms (impact of various alarms on KPIs and other performance parameters)

7) Reports that can help in analysing performance (daily performance logs to be collected, i.e. the G.826 performance report for all MW links)

As discussed in the Bharti operational meeting held in Gurgaon, Rishi, Gagan Bhashin and I will be conducting this workshop in Patna and Kolkata.

Page 2: Performance Alarms

The Alarm Reference documentation consists of alarm printouts and diagnosis reports. The alarms have been divided into four groups, and each of them is described in a dedicated piece of documentation.

The alarms are numbered in ascending order as follows:

Table: Alarm groups and numbers

Column headings (alarm number in): Notices (NOTICE); Disturbance printouts (DISTUR); Failure printouts (ALARMS); Diagnosis reports (DIAGN); Base station alarms; Transmission equipment alarms; Numbers reserved for possible external alarms

switching equipment: 0–799, 1000–1799, 2000–2799, 3000–3799, 4000–4799

O&M equipment: 800–899, 1800–1899, 2800–2899, 3800–3899, 4800–4899

transmission equipment: 900–999, 1900–1999, 2900–2999, 3900–3999, 4900–4999

diagnosis report number: 3700–3999

base station/transmission equipment alarms: 7000–7999, 8000–8999

power equipment: 5000–5499

external equipment: 5500–5999

Alarm groups and numbers
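The number ranges in the table can be turned into a small lookup, which is handy when sorting a mixed alarm list by equipment group. The following is only a sketch: the function name `equipment_group` is hypothetical, and the diagnosis report range (3700–3999) is left out on the assumption that report numbers are a separate numbering space from alarm numbers.

```python
def _build_ranges():
    """Build (low, high, group) tuples from the ranges listed in the table."""
    ranges = []
    # Each block of 1000 numbers repeats the same split between
    # switching, O&M and transmission equipment.
    for base in (0, 1000, 2000, 3000, 4000):
        ranges.append((base, base + 799, "switching equipment"))
        ranges.append((base + 800, base + 899, "O&M equipment"))
        ranges.append((base + 900, base + 999, "transmission equipment"))
    ranges.append((5000, 5499, "power equipment"))
    ranges.append((5500, 5999, "external equipment"))
    ranges.append((7000, 7999, "base station/transmission equipment alarms"))
    ranges.append((8000, 8999, "base station/transmission equipment alarms"))
    return ranges


ALARM_RANGES = _build_ranges()


def equipment_group(alarm_number: int) -> str:
    """Return the table row that contains the alarm number, or 'unknown'."""
    for low, high, group in ALARM_RANGES:
        if low <= alarm_number <= high:
            return group
    return "unknown"
```

For example, `equipment_group(2064)` falls in the 2000–2799 block and `equipment_group(7600)` in the 7000–7999 block.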

Page 3: Performance Alarms

1) Type of alarm printout

2) Name of the network element

3) Remote subscriber stage (4 characters)
Printed only if the object of the alarm is in a remote subscriber stage.

4) Computer sending the alarm

5) Alarm equipment type
SWITCH switching equipment
O&M operation and maintenance equipment
TRANSM transmission equipment
POWER power equipment
EXTERN external equipment
Unknown equipment type is printed as ??????

6) Date and time
Start or termination time of the alarm.

7) Urgency level
*** requires immediate actions
** requires actions during normal working hours
* normally no actions required
Unknown urgency level is printed as a question mark (?). The urgency level is output in all alarm printouts except notices (NOTICE). The urgency levels of terminated alarms are indicated by dots (.) instead of asterisks (*).

8) Printout type

9) Alarm object

10) Position coordinates of the alarm object

11) Alarm issuer
The program block issuing the alarm. If the name of the program block issuing the alarm is not available, the family identifier of the program block is output in hexadecimal form instead of the name. If the alarm is set in a preprocessor plug-in unit, the plug-in unit name and index are output in this field (for example PCU_S-1). In this case, the alarm concerns the functioning of the plug-in unit in question.

Structure of the BSC alarm
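The urgency level field described above can be decoded mechanically when alarm printouts are post-processed. The sketch below assumes the field has already been cut out of the printout as a string; the function name `parse_urgency` and the returned dictionary layout are hypothetical.

```python
MEANINGS = {
    3: "requires immediate actions",
    2: "requires actions during normal working hours",
    1: "normally no actions required",
}


def parse_urgency(field: str) -> dict:
    """Decode an urgency field such as '***', '**', '*', '..' or '?'."""
    field = field.strip()
    if field and len(field) in MEANINGS:
        # Asterisks mark an active alarm.
        if set(field) == {"*"}:
            return {"level": len(field), "terminated": False,
                    "meaning": MEANINGS[len(field)]}
        # Dots replace the asterisks once the alarm has terminated.
        if set(field) == {"."}:
            return {"level": len(field), "terminated": True,
                    "meaning": MEANINGS[len(field)]}
    # Anything else, including '?', is treated as an unknown urgency level.
    return {"level": None, "terminated": False, "meaning": "unknown"}
```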

Page 4: Performance Alarms

12) Trial information
If the network element has been divided into a traffic transmitting part and a trial part, this field displays the text TRIAL if the alarm was issued in the trial side.

13) Recovery information
When recovery is informed of the alarm in order to start the automatic recovery actions, this field displays *RECOV*.

14) Processing information
If the alarm is set before the start-up of the distributed part of the alarm system, this field displays LIB. Note that this kind of alarm does not stay as an active alarm and thus there will be no cancel printout for it.

15) Consecutive number
Failure printouts (***, **, *) are numbered in ascending order. With the help of the number the operating personnel can follow the update and cancel printouts of the original failure printout.

16) Alarm number
Alarm number is an unambiguous identifier for an alarm. It is also a search index for the description of the alarm.

17) Alarm text
Alarm text is a short description of the alarm.

18) Supplementary information fields

19) Supplementary text
A more detailed text printed out in some alarms.

20) Alarm operating instructions
The user defines an operating instruction with the AOA MML command for an alarm. If the instruction has been defined, then it is displayed in the alarm printout.

Cont…..

Page 5: Performance Alarms

1) Type of alarm printout
Standard alarm printout
<HIST> Alarm history printout

2) Exchange

3) BCF number

4) BTS number

5) Event type
COMM communication failure
QUAL quality of service
PROCES processing failure
EQUIPM equipment failure
ENVIR environmental failure

6) Date

7) Time

8) Urgency level
*** requires immediate actions
** requires actions during working hours
* no actions required
The urgency level is output in all alarm printouts except notices (NOTICE). The urgency levels of terminated alarms are indicated by dots (.) instead of asterisks (*).

9) Printout type
ALARM fault situation
CANCEL fault terminated
DISTUR disturbance
NOTICE notice

10) TRX number

11) BTS name (15 characters)

12) Alarm object (FU/CU/LAPD/PCM/RTSL/TRE/DMR/TRU)

13) State of the alarm object
ENABLED The alarm object is able to provide traffic services
DISABLED The alarm object is not able to provide traffic services

Structure of a BTS alarm

Page 6: Performance Alarms

14) Consecutive alarm number

15) Alarm number
Alarm number is a search index in the alarm reference documentation.

16) Text
The text is a short description of the alarm.

17) Supplementary text
For alarms in the number range 7600...7699, this field indicates the fault reason.

18) Supplementary information fields
A maximum of 16 fields, separated from one another by one or several spaces.

Page 7: Performance Alarms

Reasons for Alarm triggering

• Transmission Problem
– Cable/Connector
– Mux
– M/W ODU/IDU
– M/W fading/interference
– Configuration problem
• Hardware problem at BTS.
• VSWR/loose cables at BTS.
• Configuration Problem in BSC.
• Hardware Problem at BSC.
• Configuration Problem in BTS.
• Equipment failure (BSC/BTS)
• Environmental failure (High Temp)
• Configuration Problem in MSC.
• Configuration Problem in SGSN.
• Wrong RF parameters.

Page 8: Performance Alarms

How to get the Alarms in the network

• GNSC sends a list of the active alarms in the network daily.

• GNSC prepares a daily alarm history report for the circles and stores it at the locations given below. The report can be collected from there.

OSS Name IP ftp Report available for Circles Path

Bharti-Mah 10.64.2.1 Maharashtra and MP m/home/nocldr/system_alarm_count

Bharti-Mum 10.115.16.129 Gujarat and Mumbai m/home/nocldr/system_alarm_count

Bharti-WB 10.133.1.1 Kolkata, Orissa and WB /home/nnocld/system_alarm_count

Bihar 10.137.16.131 Bihar m/home/nocldr/system_alarm_count

Page 9: Performance Alarms

Cont..

• Net Doctor reports 34 and 35 can be used to obtain the alarm count in the circle.

• Performance report for transmission equipment.

• MML alarm logs: ZEOL, ZAHO, ZEOH and ZAHP output for each NE.

• The daily MAPA also provides information about KPI degradation in cells.

Page 10: Performance Alarms

Owner for alarms

Owner: BSS OMC

Rectification responsibility: Transmission, RF, BSS FLM, BSS OMC, Switch, SGSN

Page 11: Performance Alarms

No. | Text | Alarm No. | Alarm count | Urgency | Teams that need to work on the alarm
1 | BCF FAULTY | 7600 | 179 | *** | BSS FLM
2 | BCF OPERATION DEGRADED | 7601 | 487 | ** | BSS FLM
3 | BCF NOTIFICATION | 7602 | 6926 | * | BSS FLM
4 | BTS FAULTY | 7603 | 472 | *** | BSS FLM
5 | BTS OPERATION DEGRADED | 7604 | 421 | ** | BSS FLM, BSS OMC
6 | TRX FAULTY | 7606 | 9418 | ** | BSS FLM, BSS OMC
7 | TRX OPERATION DEGRADED | 7607 | 388 | ** | BSS FLM, BSS OMC
9 | OSCILLATOR ADJUSTING TEMPORARILY INTERRUPTED | 7616 | - | * | BSS FLM
11 | PCM FAILURE | 7704 | 3241 | ** | BSS FLM, Transmission
12 | LAPD FAILURE | 7705 | 23691 | * | BSS FLM, Transmission, BSS OMC
13 | CONTINUOUS RESTARTS OF BCF/TRX | 7715 | - | * | BSS FLM, Transmission
14 | TRAFFIC CHANNEL ACTIVATION FAILURE | 7725 | 149 | ** | BSS FLM, BSS OMC
15 | CONFIGURATION OF BCF FAILED | 7730 | 239 | ** | BSS FLM, BSS OMC
16 | BTS WITH NO TRANSACTIONS | 7738 | 155 | ** | BSS FLM, BSS OMC
17 | MEAN HOLDING TIME BELOW DEFINED THRESHOLD | 7743 | 1684 | ** | BSS FLM, BSS OMC, RF
18 | EXCESSIVE TCH INTERFERENCE | 7744 | 338 | ** | BSS OMC, RF
19 | CHANNEL FAILURE RATE ABOVE DEFINED THRESHOLD | 7745 | 5052 | ** | BSS FLM, BSS OMC, RF
20 | CH CONGESTION IN CELL ABOVE DEFINED THRESHOLD | 7746 | 1020 | ** | BSS OMC, RF
21 | FREQUENCY ERROR | 8112 | 817 | * | BSS FLM, Transmission
22 | SUBRACK HAS MISSING UNITS | 8139 | 1037 | *** | BSS FLM, Transmission
23 | ROUTE SET UNAVAILABLE | 2064 | 279 | * | BSS OMC, Transmission, BSS FLM
24 | LINK SET UNAVAILABLE | 2070 | 148 | *** | BSS OMC, Transmission, BSS FLM
25 | FAILURE IN SIGNALLING LINK ACTIVATION OR RESTORATION | 2072 | 369 | ** | BSS OMC, Transmission, BSS FLM
26 | FAULT RATE MONITORING | 2915 | 6312 | *** | Transmission, BSS FLM
27 | BTS AND TC UNSYNCHRONIZATION CLEAR CALLS ON A INTERFACE | 2992 | 1413 | ** | Transmission, BSS OMC
28 | BTS AND TC UNSYNCHRONIZATION CLEAR CALLS ON ABIS INTERFACE | 2993 | 3549 | ** | Transmission, BSS FLM
29 | NETWORK SERVICE ENTITY UNAVAILABLE | 3019 | 248 | ** | Transmission, BSS FLM, SGSN
30 | NETWORK SERVICE VIRTUAL CONNECTION UNAVAILABLE | 3020 | 232 | * | Transmission, BSS FLM, SGSN
31 | RECEIVED BIT ERROR RATIO (BER) > 1E-3 | 8099 | 3160 | *** | Transmission, BSS FLM
32 | FAR-END ALARM | 8179 | 970 | ** | Transmission, BSS FLM
33 | BCCH IS NOT AT PREFERRED BCCH TRX | 7734 | 140 | - | BSS FLM, BSS OMC
34 | BTS/WBCx: TX ANTENNA FAULTY | - | - | - | BSS FLM
35 | TRANSCODER CHANNEL FAILURE | 2955 | 41 | - | BSS OMC

Major performance alarms observed in networks
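One simple way to decide where alarm reduction work should start is to rank the alarms by their count. The sketch below uses an excerpt of the counts from the table on this page; the helper name `top_alarms` is hypothetical, and the list would need to be extended with the remaining rows for a full ranking.

```python
# (alarm number, alarm text, alarm count): an excerpt of the table on this page
ALARM_COUNTS = [
    (7600, "BCF FAULTY", 179),
    (7601, "BCF OPERATION DEGRADED", 487),
    (7602, "BCF NOTIFICATION", 6926),
    (7603, "BTS FAULTY", 472),
    (7606, "TRX FAULTY", 9418),
    (7704, "PCM FAILURE", 3241),
    (7705, "LAPD FAILURE", 23691),
    (7745, "CHANNEL FAILURE RATE ABOVE DEFINED THRESHOLD", 5052),
    (2915, "FAULT RATE MONITORING", 6312),
    (2993, "BTS AND TC UNSYNCHRONIZATION CLEAR CALLS ON ABIS INTERFACE", 3549),
    (8099, "RECEIVED BIT ERROR RATIO (BER) > 1E-3", 3160),
]


def top_alarms(rows, n=5):
    """Return the n rows with the highest alarm count, highest first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]


if __name__ == "__main__":
    for number, text, count in top_alarms(ALARM_COUNTS):
        print(f"{number} {text}: {count}")
```

With these counts, LAPD FAILURE (7705) and TRX FAULTY (7606) come out on top.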

Page 12: Performance Alarms

7600 BCF FAULTY

• Power unit is probably broken.
• Oven oscillator is broken.
• Temperature inside the TRX is dangerously high.
• Temperature inside the BTS is too high.
• The TX output power is out of limits (over maximum or under minimum).
• RF Impacted KPI : High TCH and SDCCH Blocking.

Page 13: Performance Alarms

7601 BCF OPERATION DEGRADED

• Cabinet I2C bus is jammed.
• Incompatible unit presence has been detected in the BTS.

Instructions:
• AC is off and the battery voltage is too low or battery is being charged.
• Difference between the PCM and BTS frequency reference.
• The TX output power has dropped at least 3 dB.
• Oven oscillator control is close to its margins.
• Power unit output voltage fault.
• Power unit input voltage fault.
• No connection to power unit.
• Power unit temperature is dangerously high.
• Transmission unit temperature is dangerously high.
• RF Impacted KPI : High TCH and SDCCH Blocking & Drops, TASR and HOSR.

Page 14: Performance Alarms

7602 BCF NOTIFICATION

• Temperature inside the TRX is high.
• External synchronisation signals disabled

Instructions:
• Power unit output voltage fault.
• Power unit input voltage fault.
• No connection to power unit.

Instructions
• Mismatch between BSC/MMI configuration file and the actual configuration.
• External synchronisation signals disabled
• RF Impacted KPI : TASR and HOSR.

Page 15: Performance Alarms

7604 BTS OPERATION DEGRADED

• 7606 TRX FAULTY 7607 TRX OPERATION DEGRADED

• Rx levels differ too much between main and diversity antennas

• RF Impacted KPI : High TCH and SDCCH Drops, TASR and HOSR.

Page 16: Performance Alarms

7603 BTS FAULTY

• 7606 TRX FAULTY

• RF Impacted KPI : High TCH and SDCCH Blocking.

Page 17: Performance Alarms

7606 TRX FAULTY

• Failure detected during TRX configuring.
• Antenna connection faulty.
• External frame clock synchronisation failed.
• External frame clock missing.
• External frame number synchronisation failed.
• Fault in VSWR antenna monitoring.
• BOI detected that connection to TRX is lost.
• No connection to dual variable gain duplex unit.
• There is disturbance in the serial DL bus or bus is broken.
• RF Impacted KPI : High TCH and SDCCH Blocking.

Page 18: Performance Alarms

7607 TRX OPERATION DEGRADED

• Fault in the chain between power unit and MHA.
• Fault in VSWR antenna monitoring.
• No connection to dual variable gain duplex unit.
• TRX is unable to implement EDGE services.
• IDD main/auxiliary TRX frequency band mismatch.
• Mismatch between BSC/MMI configuration and/or IDD TRX.
• RF Impacted KPI : High TCH and SDCCH Drops, TASR and HOSR.

Page 19: Performance Alarms

7616 OSCILLATOR ADJUSTING TEMPORARILY INTERRUPTED

• Oven oscillator adjustment function interrupted.

• The Abis frequency is too high or too low, or it fluctuates so much that the master clock tune is unreliable and tends to oscillate too much. Another reason for this alarm can be that the oven adjustment scenario has tuned the oven too near the edge. When the tune is stopped, the master clock can go by itself out of specification and therefore calls may drop after several hours or days. This is a non-fatal alarm but it indicates decreased traffic capacity in the base station.

• RF Impacted KPI : High TCH and SDCCH Drops, TASR and HOSR.

Page 20: Performance Alarms

7743 MEAN HOLDING TIME BELOW DEFINED THRESHOLD

• Mean holding time on a channel is below the operator-defined minimum during the measurement period. The alarm is used to supervise the functioning of traffic channels and to detect the possible faulty channels.

• Restore the channel by first locking it out of use and then unlocking it.

• Minimum mean holding time for a TCH (10 s)
• Length of TCH supervision measurement (120 min)
• Channel seizure threshold value (10)
• RF Impacted KPI : High TCH and SDCCH Drops, TASR and HOSR.
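The alarm condition can be illustrated with a short check over one measurement period. This is a sketch under stated assumptions, not the BSC's actual logic: the function and parameter names are hypothetical, and the channel seizure threshold is assumed to be the minimum number of seizures a channel needs before the mean holding time is evaluated at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HoldingTimeParams:
    # Values quoted on this page.
    min_mean_holding_time_s: float = 10.0
    measurement_period_min: int = 120
    seizure_threshold: int = 10


def holding_time_alarm(total_holding_time_s: float, seizures: int,
                       params: HoldingTimeParams = HoldingTimeParams()) -> bool:
    """Return True if the mean holding time of one channel is below the minimum.

    Both inputs are totals for a single measurement period of the channel.
    """
    # Assumption: channels with too few seizures are not evaluated.
    if seizures < params.seizure_threshold:
        return False
    mean_holding_time_s = total_holding_time_s / seizures
    return mean_holding_time_s < params.min_mean_holding_time_s
```

A channel with 20 seizures and 60 s of total holding time has a mean of 3 s and would be flagged; the same 20 seizures with 600 s in total would not.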

Page 21: Performance Alarms

7744 EXCESSIVE TCH INTERFERENCE

• During the supervision period, the TCH time slot has suffered excessive interference in idle mode that is equal to or higher than the operator-defined alarm threshold percentage. The alarm is used to supervise the BTS traffic capacity.

• Measure the interference level on the traffic channel in question.

• TCH interference level threshold value (50 %)
• Excessive interference level threshold value (4)
• TCH interference supervision measurement (120 min)
• RF Impacted KPI : High TCH and SDCCH Drops, TASR and HOSR.
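A rough sketch of this check is given below. It assumes that the idle-mode interference of a time slot is available as a list of interference level samples for the supervision period, and that a sample counts as excessive when its level is at or above the excessive interference level threshold (4). Both the input format and the function name `interference_alarm` are assumptions made for illustration.

```python
def interference_alarm(idle_level_samples, level_threshold=4,
                       percentage_threshold=50.0) -> bool:
    """Return True if the share of excessive samples reaches the threshold.

    idle_level_samples: interference levels measured while the TCH time
    slot was idle during one supervision period.
    """
    if not idle_level_samples:
        return False
    excessive = sum(1 for level in idle_level_samples if level >= level_threshold)
    share_pct = 100.0 * excessive / len(idle_level_samples)
    # "Equal to or higher than" the threshold percentage raises the alarm.
    return share_pct >= percentage_threshold
```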

Page 22: Performance Alarms

7745 CHANNEL FAILURE RATE ABOVE DEFINED THRESHOLD

• The rate of calls terminating in failure on a channel is above the threshold value set by the operator. The alarm is used to supervise the functioning of traffic and signaling channels, and to detect the possible faulty channels.

• Restore the channel by first locking it out of use and then unlocking it.

• TCH failure rate (20 %)
• SCHFR = SDCCH failure rate (80 %)
• Length of supervision period (60 min)
• CS = channel seizure threshold value (10)
• Supervision start time (08-00)
• Supervision end time (18-00)
• RF Impacted KPI : High TCH and SDCCH Drops and HOSR.
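The parameters above can be combined into a small check for one supervision period. The sketch below is illustrative only: the function name `failure_rate_alarm` is hypothetical, and it assumes that periods starting outside the supervision window (08:00 to 18:00) and channels with fewer seizures than the threshold are not evaluated.

```python
from datetime import time

# Values quoted on this page.
FAILURE_RATE_THRESHOLD_PCT = {"TCH": 20.0, "SDCCH": 80.0}
SEIZURE_THRESHOLD = 10
SUPERVISION_START = time(8, 0)
SUPERVISION_END = time(18, 0)


def failure_rate_alarm(channel_type: str, seizures: int, failures: int,
                       period_start: time) -> bool:
    """Return True if the failure rate of one channel is above its threshold."""
    # Assumption: only periods that start inside the window are supervised.
    if not (SUPERVISION_START <= period_start < SUPERVISION_END):
        return False
    # Assumption: too few seizures means the rate is not evaluated.
    if seizures < SEIZURE_THRESHOLD:
        return False
    failure_rate_pct = 100.0 * failures / seizures
    return failure_rate_pct > FAILURE_RATE_THRESHOLD_PCT[channel_type]
```

A TCH with 50 seizures and 15 failures (30 %) during working hours would be flagged, while an SDCCH with the same figures would not, because its threshold is 80 %.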

Page 23: Performance Alarms

7746 CH CONGESTION IN CELL ABOVE DEFINED THRESHOLD

• The percentage of rejected channel seizure requests due to congestion as opposed to all channel seizure requests in the base station is above the operator-defined alarm threshold.

• Increase the network capacity if necessary.

• Channel seizure request threshold value (10)
• SDCCH congestion threshold value (20 %)
• TCH congestion threshold value (20 %)
• Length of congestion supervision period (120 min)
• RF Impacted KPI : High TCH and SDCCH Blocking.
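The congestion percentage described above is the number of requests rejected due to congestion divided by all channel seizure requests in the supervision period. A minimal sketch follows, assuming that cells with fewer requests than the channel seizure request threshold are skipped; the function name `congestion_alarm` is hypothetical.

```python
def congestion_alarm(requests: int, rejected: int,
                     congestion_threshold_pct: float = 20.0,
                     request_threshold: int = 10) -> bool:
    """Return True if the congestion percentage is above the threshold.

    requests: all channel seizure requests in the supervision period.
    rejected: requests rejected due to congestion in the same period.
    The defaults are the SDCCH/TCH values quoted on this page.
    """
    # Assumption: too few requests means the percentage is not evaluated.
    if requests < request_threshold:
        return False
    congestion_pct = 100.0 * rejected / requests
    return congestion_pct > congestion_threshold_pct
```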

Page 24: Performance Alarms

8112 FREQUENCY ERROR

• A frequency error has been detected in the input signal. This alarm affects the provided services.

• RF Impacted KPI : High TASR and HOSR.

Page 25: Performance Alarms

8139 SUBRACK HAS MISSING UNITS

• An installation or usage error has been detected in a piece of equipment. This alarm may affect the provided services.

• RF Impacted KPI : High TCH and SDCCH Drops.

Page 26: Performance Alarms

2064 ROUTE SET UNAVAILABLE

• The signalling point cannot be reached because none of the signalling routes of the signalling route set can be used. Signalling traffic to the signalling point concerned is totally blocked.

• RF Impacted KPI : Complete service area of BSC is impacted (100% outage).

Page 27: Performance Alarms

2070 LINK SET UNAVAILABLE

• All signalling links in the signalling link set are unavailable. There is no direct connection to the partner exchange to which this link set is connected.

• There is something wrong with the data transmission connections of the links of this link set, and/or links have been blocked.

• RF Impacted KPI : Complete service area of BSC is impacted (100% outage).

Page 28: Performance Alarms

2072 FAILURE IN SIGNALLING LINK ACTIVATION OR RESTORATION

• The activation or restoration of a signalling link fails. If there are other available signalling links in the signalling link set, signalling traffic is transmitted through them. Signalling transmission capacity is, however, decreased.

• RF Impacted KPI : Complete service area of the BSC is impacted. SDCCH and TCH drops in all cells of the BSC will be severely affected.

Page 29: Performance Alarms

2992 BTS AND TC UNSYNCHRONIZATION CLEAR CALLS ON A INTERFACE

• Calls have been cleared repeatedly on the same A interface circuit due to BTS and transcoder unsynchronisation and the number of successive releases has exceeded the limit defined.

• This alarm only informs the user that calls are cleared due to unsatisfactory transmission or equipment failure but it does not directly indicate the faulty equipment, for example a plug-in unit.

• Block the circuit.
• Check the transmission between (the BTS) the BSC and transcoder.
• Check other active alarms concerning (the BTS) the BSC and transcoder transmission or transcoder plug-in units.
• RF Impacted KPI : High TCH and SDCCH Drops.

Page 30: Performance Alarms

2993 BTS AND TC UNSYNCHRONIZATION CLEAR CALLS ON ABIS INTERFACE

• Calls have been cleared three successive times on the same Abis interface channel due to BTS and transcoder unsynchronisation.

• This alarm only informs the user that calls are cleared due to unsatisfactory transmission or equipment failure but it does not directly indicate the faulty equipment, for example a plug-in unit.

• Lock the channel.
• Check the transmission between the BTS and the BSC (and transcoder).
• Check the alarms in transcoder plug-in units.
• RF Impacted KPI : High TCH and SDCCH Drops, TASR and HOSR.

Page 31: Performance Alarms

3020 NETWORK SERVICE VIRTUAL CONNECTION UNAVAILABLE

• A Network Service Virtual Connection is unavailable. It cannot be used for General Packet Radio Service, that is, the data transfer capacity has decreased.

In FR transport mode:
• Check first if the other end has manually taken down the Network Service Virtual Connection.
• Check the possible hardware and configuration problems mentioned below.
• Check that the PCM lines carrying the Frame Relay bearer channel and Network Service Virtual Connection are operational.
• Check that the plug-in units terminating the Network Service Virtual Connection, the AS7 in the SGSN's PAPU and the PCU in the BSC's BCSU, are operational.
• Check also that the identification parameters (NS-VCI, NSEI) of the connections are the same at both ends.
• Check that the DLCI is configured correctly. It has to be the same at both ends of a point-to-point link.

In IP transport mode (value of the first supplementary information field is 0xFFFF):
• Check first if the other end has manually taken down the Network Service Virtual Connection.
• Check the possible hardware and configuration problems.

• RF Impacted KPI : Complete EDGE/GPRS service unavailable on the cells defined in the NS-VCI.

Page 32: Performance Alarms

8099 RECEIVED BIT ERROR RATIO (BER) > 1E-3

• The bit error rate has been exceeded in the input signal. This alarm affects the provided services.

• RF Impacted KPI : Complete outage of the site or severely impacts all the RF KPI if link is fluctuating.
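The bit error ratio in the alarm name is the number of errored bits divided by the number of bits received, and the alarm threshold is 1E-3. A small sketch of that arithmetic is shown below; the function names are hypothetical, and the 2 048 000 bits used in the usage note are simply one second of a 2.048 Mbit/s PCM signal chosen for illustration.

```python
BER_ALARM_THRESHOLD = 1e-3  # threshold taken from the alarm name (BER > 1E-3)


def bit_error_ratio(errored_bits: int, total_bits: int) -> float:
    """Return errored bits divided by total received bits."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    return errored_bits / total_bits


def ber_alarm(errored_bits: int, total_bits: int) -> bool:
    """Return True if the measured BER exceeds 1E-3."""
    return bit_error_ratio(errored_bits, total_bits) > BER_ALARM_THRESHOLD
```

With 5000 errored bits out of 2 048 000, the ratio is roughly 2.4E-3 and the check returns True; with 100 errored bits it stays well below the threshold.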

Page 33: Performance Alarms

8179 FAR-END ALARM

Transmission equipment has received a far-end alarm. This alarm does not affect the provided services.

• RF Impacted KPI : Complete outage of the Site.

Page 34: Performance Alarms

7734 BCCH IS NOT AT PREFERRED BCCH TRX

• The BTS has at least one TRX defined as the preferred BCCH TRX but the BCCH channel is not on such a TRX so the BCCH coverage may have decreased (If ICE feature used).

• RF Impacted KPI : EDGE might get disabled.

Page 35: Performance Alarms

• 3019 NETWORK SERVICE ENTITY UNAVAILABLE (Complete outage)
• 7704 PCM FAILURE (Complete outage)
• 7705 LAPD FAILURE (TCH and SDCCH blocking)
• 7715 CONTINUOUS RESTARTS OF BCF/TRX (TCH and SDCCH blocking & Drops)
• 7725 TRAFFIC CHANNEL ACTIVATION FAILURE (TCH Drop and Blocking)
• 7730 CONFIGURATION OF BCF FAILED (SDCCH and TCH blocking or Complete outage)
• 7738 BTS WITH NO TRANSACTIONS (
• 2915 FAULT RATE MONITORING

Page 36: Performance Alarms

BTS alarm system

The BTS alarm system receives alarm indications from the following sources in the radio network:

• BTS equipment alarms
• BTS software alarms
• external alarms from the BTS site
• alarms concerning PCM circuits in the radio network
• alarms from LAPD links in the radio network
• alarms from Abis interface equipment
• alarms from application programs of the BSC
• transmission equipment alarms.

This block diagram illustrates the structure of the BTS alarm system:

Page 37: Performance Alarms

Displaying alarms

Alarms are written in a logical file. From the logical file, they can be directed to a local printer or to Nokia NetAct. The table below shows the logical files used with alarms. For further information on the logical files used with alarms, refer to the Logical files directed to terminal devices section of Logical Files.

Table: The logical files used with alarms

Logical file: Explanation

SWITCH1 Two- or three-star switching equipment alarms

TRANSM1 Two- or three-star transmission equipment alarms

POWER1 Two- or three-star power equipment alarms

EXTERN1 Two- or three-star external alarms

OPERMA1 Two- or three-star operation and maintenance equipment alarms

SWITCH2 Switching equipment alarms whose type is notice, disturbance, or one-star alarm

TRANSM2 Transmission equipment alarms whose type is notice, disturbance, or one-star alarm

POWER2 Power equipment alarms whose type is notice, disturbance, or one-star alarm

EXTERN2 External alarms whose type is one-star alarm

OPERMA2 Operation and maintenance alarms whose type is notice, disturbance, or one-star alarm

BTSALARM1 Three-star BTS alarms

BTSALARM2 Two-star BTS alarms

BTSALARM3 One-star BTS alarms

BTSALARM4 BTS notices and disturbances
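The table follows a regular pattern: for DX 200 alarms, suffix 1 collects two- and three-star alarms and suffix 2 collects notices, disturbances and one-star alarms, while BTS alarms get one file per class. The lookup below is a sketch of that pattern only; the function names and the short equipment keys are hypothetical.

```python
# Logical file prefixes per equipment type, as listed in the table.
DX200_PREFIX = {
    "switching": "SWITCH",
    "transmission": "TRANSM",
    "power": "POWER",
    "external": "EXTERN",
    "o&m": "OPERMA",
}

BTS_FILES = {
    "***": "BTSALARM1",
    "**": "BTSALARM2",
    "*": "BTSALARM3",
    "notice": "BTSALARM4",
    "disturbance": "BTSALARM4",
}


def dx200_logical_file(equipment: str, kind: str) -> str:
    """Return the logical file for a DX 200 alarm.

    kind is one of 'notice', 'disturbance', '*', '**' or '***'.
    """
    prefix = DX200_PREFIX[equipment]
    if kind in ("**", "***"):
        return prefix + "1"
    if kind == "*":
        return prefix + "2"
    if kind in ("notice", "disturbance"):
        # The table lists only one-star alarms for EXTERN2.
        if equipment == "external":
            raise ValueError("no logical file listed for external " + kind)
        return prefix + "2"
    raise ValueError("unknown alarm kind: " + kind)


def bts_logical_file(kind: str) -> str:
    """Return the logical file for a BTS alarm, notice or disturbance."""
    return BTS_FILES[kind]
```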

Each alarm event (an alarm and its cancellation) that is not filtered by the alarm system is saved in a log file. This log data is called alarm history. Using the commands of the AH command group, you can display the history data concerning the system's alarm situation. For further information on displaying alarm history, refer to Alarm History Handling (AH) for DX 200 alarms, and Base Transceiver Station Alarms Handling (EO) for BTS alarms.

You can display either the alarm history or only the active alarms on the selected output device. The printout format is the same as in alarm printouts.

It is recommended that you print out the active alarms every day and check whether they require maintenance actions.

A DX 200 alarm whose object unit is not in the normal working state is normally filtered by the alarm system. When the alarm is filtered (by any means), it is neither printed out nor stored in the alarm history. However, an alarm that is filtered purely on the basis of the state of its object unit is printed out when displaying active alarms.

You can print out active DX 200 alarms with the command AHO, and active BTS alarms with the command EOL. You can display the DX 200 alarm history data with the command AHP, and the BTS alarm history data with the command EOH.
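The four commands can be kept in a small lookup so that a collection script picks the right one per alarm system and view. This is only a sketch: the function name `alarm_command` and the `with_z_prefix` option are hypothetical, the latter reflecting the ZAHO/ZEOL spelling used earlier in this deck for the MML alarm logs. No command parameters are shown because none are given here.

```python
# Command names as given on this page.
ALARM_COMMANDS = {
    ("DX 200", "active"): "AHO",
    ("DX 200", "history"): "AHP",
    ("BTS", "active"): "EOL",
    ("BTS", "history"): "EOH",
}


def alarm_command(system: str, view: str, with_z_prefix: bool = False) -> str:
    """Return the command name for a system ('DX 200' or 'BTS') and a view
    ('active' or 'history')."""
    command = ALARM_COMMANDS[(system, view)]
    # Optionally use the Z-prefixed spelling (ZAHO, ZEOL, ...).
    return "Z" + command if with_z_prefix else command
```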