
Upload: vickyparasar163

Post on 28-Sep-2015


Failure After Admission: Iub Congestion

1. If a UTRAN cell has a high number of RRC/RAB establishment request failures after being admitted by Admission Control (pmNoFailedAfterAdm), a common reason is Iub congestion. When considering the Iub interface it is important to remember that it is mainly RABs configured with strict AAL2 QoS settings that will be blocked at call setup by AAL2 CAC. Typically the R99 RABs (i.e. all RABs excluding the HSDPA and EUL RABs) are configured to use AAL2 QoS class A or class B, both of which use a strict QoS. HSDPA and EUL will typically use AAL2 QoS class C and class D, both of which use a best effort QoS. The R99 Packet Interactive RAB will typically be the first RAB to show signs of AAL2 congestion, with a poor Packet Interactive CSSR and a correspondingly high pmNoFailedAfterAdm. The AAL2 Setup Success Rate statistics from the relevant RXI towards the RBS may then be investigated. This should typically be 99% and above; if it is not, and the counter pmUnSuccOutConnsLocal indicates that the failures are local rejections (on the RXI) by CAC, then there is congestion on the Iub interface.

Example: FACTS reports showing high pmNoFailedAfterAdm (1st plot), low CSSR Packet Interactive (2nd plot), and a low AAL2 Setup Success Rate with a correspondingly high pmUnSuccOutConnsLocal (3rd plot). From 2006-11-24 the problem disappears. In this case the solution was to activate Directed Retry to GSM and to change the AAL2 QoS class B traffic to use a best effort configuration, thereby allowing more PS64/128 and PS64/384 users (as well as ordering a 2nd E1 to the site). Note that this RBS did not have HSDPA configured, so there was no concern about affecting the experience of HS users as described in the section Considerations For HSDPA: Iub Bandwidth.
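The counter check described in point 1 can be sketched as follows. This is a minimal illustration, not the vendor's formula: only pmUnSuccOutConnsLocal is a counter named in this document, and the other argument names (successful and remotely rejected connection counts) are hypothetical stand-ins for the actual PM counters of the node release.

```python
# Hedged sketch: flag Iub AAL2 CAC congestion from RXI->RBS link counters.
# Argument names other than unsucc_local (pmUnSuccOutConnsLocal) are
# assumptions for illustration; check the real counter names in your release.

def aal2_setup_success_rate(succ_conns, unsucc_local, unsucc_remote):
    """AAL2 Setup Success Rate (%) for a link, out of all setup attempts."""
    attempts = succ_conns + unsucc_local + unsucc_remote
    if attempts == 0:
        return 100.0  # no attempts -> nothing to blame on the link
    return 100.0 * succ_conns / attempts

def iub_congested(succ_conns, unsucc_local, unsucc_remote, threshold=99.0):
    """Iub congestion is indicated when the success rate drops below the
    ~99% level cited in this document AND the failures are dominated by
    local CAC rejections (on the RXI) rather than remote ones."""
    rate = aal2_setup_success_rate(succ_conns, unsucc_local, unsucc_remote)
    return rate < threshold and unsucc_local > unsucc_remote
```

For example, 9500 successes against 400 local and 100 remote rejections gives a 95% success rate with local rejections dominating, which this check flags as Iub congestion.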

Failure After Admission: Core Transport Network Congestion

2. Related to the above point (Failure After Admission: Iub Congestion) is transport network congestion in links other than the Iub, e.g. the RNC->MGW (Iu-cs), RNC->SGSN (Iu-ps) and inter-MGW links. If this is the case then the CSSR of an entire RNC (or several RNCs) will deteriorate, along with the AAL2 Setup Success Rate for a major link to the RNC. It would then be necessary to look at the link utilisation in order to confirm such link congestion, but that is beyond the scope of this document.

Example: FACTS reports showing poor CSSR Speech for CTRNC1 for two days and then an improvement for the next two days (1st plot); and the corresponding AAL2 Setup Success Rate for the CTMGW1->RBMGW1 (2nd plot) and RBMGW1->CTMGW1 (3rd plot) links for the same days. The CTMGW1->RBMGW1 link had a high utilisation (>80%) so the peak cell rate (PCR) for the link was increased, resulting in the noticeable improvement.
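The decision made in the example above can be sketched as a simple rule. The 80% utilisation and 99% success rate thresholds come from this document; the function and parameter names are illustrative assumptions, and real link utilisation would be derived from transport counters rather than passed in directly.

```python
# Hedged sketch: decide whether a core transport link (e.g. an inter-MGW
# AAL2 path) is a congestion suspect and may need its PCR increased.
# Names and the cells/s unit are illustrative, not actual MO attributes.

def link_utilisation(measured_cell_rate, pcr):
    """Utilisation of an AAL2 path as a fraction of its peak cell rate
    (both in cells/s)."""
    return measured_cell_rate / pcr

def needs_pcr_increase(measured_cell_rate, pcr, aal2_ssr,
                       util_threshold=0.80, ssr_threshold=99.0):
    """Suggest a PCR increase when the link is highly utilised (>80% in
    this document's example) AND the AAL2 Setup Success Rate on the same
    link is degraded (below ~99%)."""
    high_util = link_utilisation(measured_cell_rate, pcr) > util_threshold
    return high_util and aal2_ssr < ssr_threshold
```

A link running at 90% of its PCR with a 95% AAL2 Setup Success Rate would be flagged; a lightly loaded link with a healthy success rate would not, pointing the investigation elsewhere.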

Failure After Admission: Hardware Usage (Channel Elements)

3. A high number of RRC/RAB setup failures after admission (pmNoFailedAfterAdm) could be due to insufficient UL or DL RBS hardware capacity, i.e. too few channel elements available. The channel element capacity of an RBS may be software limited (according to the software license configured for the RBS) or hardware limited (according to the TXBs and RAXBs installed in the RBS). The two parameters that control the RBS hardware admission policy are ulHwAdm and dlHwAdm. If these parameters are set to a value lower than 100% then Admission Control should block any RRC/RAB setup attempts requiring more than the available channel elements (see Admission Control: Hardware Usage); however, by default these parameters are set to 100%, in which case no hardware is reserved for handovers and Admission Control will not block RAB establishment attempts for this reason, so the setup attempt fails after admission. The RBS counters pmSetupFailureSfXX in the UplinkBasebandPool (ULSETUPFAILURESSFXX) and pmSetupFailureSfXX in the DownlinkBasebandPool (DLSETUPFAILURESSFXX) indicate RL (at SF XX) setup failures due to a lack of UL and DL hardware capacity respectively. If this is the case then a short term solution may be to reduce the traffic carried by the site (see the Traffic Offload sections). The long term solution is to upgrade the UL (RAXB) or DL (TXB) channel element capacity of the site, either by swapping the relevant board with that of another site that has more capacity than it requires, or by sourcing a new board. Note that it is possible for these counters to increment even when there should be sufficient channel element capacity (for example due to a software bug in the software revision being used; see Failure After Admission: Other), so it is important to compare the channel element usage to the channel element capacity of the RBS to make sure that it makes sense for this to be the root of the problem.

Example: FACTS reports showing poor CSSR Packet Interactive (1st plot); high pmNoFailedAfterAdm (2nd plot); and UL setup failures due to a lack of UL baseband hardware capacity (RAXB). Note that this RBS had a 64 UL channel element capacity until 31 August, when it was upgraded to 128 UL channel elements. The estimated UL CE usage peaks above 64 channel elements even before the 31st, confirming that RAXB congestion is the source of the problem; after the upgrade to 128 channel elements the UL CE usage starts peaking above 100, indicating how necessary the upgrade was. The improvement in CSSR Packet Interactive and the decrease in pmNoFailedAfterAdm after the RAXB upgrade are clearly noticeable.
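The sanity check from point 3 (compare channel element usage against the admitted capacity before blaming hardware) can be sketched as follows. All capacity figures and function names are illustrative assumptions; ulHwAdm/dlHwAdm and the license vs. hardware limits are the concepts taken from this document.

```python
# Hedged sketch: is CE (channel element) congestion a plausible root cause
# for pmSetupFailureSfXX increments? Capacity is the lower of the licensed
# and installed-hardware limits, scaled by the admission policy parameter.

def effective_ce_capacity(licensed_ce, hardware_ce, hw_adm_pct=100):
    """Admission Control admits up to hwAdm% (ulHwAdm/dlHwAdm) of the
    usable capacity; the default 100% reserves no headroom for handovers."""
    usable = min(licensed_ce, hardware_ce)
    return usable * hw_adm_pct // 100

def ce_congestion_plausible(peak_ce_usage, licensed_ce, hardware_ce,
                            hw_adm_pct=100):
    """Setup failures only make sense as CE congestion if observed peak
    usage actually reaches the admitted capacity; if usage is far below
    it, suspect a software fault (see Failure After Admission: Other)."""
    capacity = effective_ce_capacity(licensed_ce, hardware_ce, hw_adm_pct)
    return peak_ce_usage >= capacity
```

Applied to the examples in this document: peak UL usage of 64 CEs against a 64 CE limit makes RAXB congestion plausible, whereas DL usage peaking around 6 CEs against a far larger capacity does not, which is exactly the contradiction pursued in point 4 below.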

Failure After Admission: Other

4. If none of the above reasons for a poor CSSR are apparent, then it is likely to be a more complicated problem to resolve, often relating to a software/hardware fault, or perhaps an external source of interference in the area. At the time of writing, the 3G technology is not as mature as the current 2G system (as would be expected) and hence there are still numerous improvements being implemented in every software release, along with the continued development of new, more efficient and optimised hardware generations for the various 3G nodes. The example below illustrates one problem of this type.

Example: FACTS reports showing poor CSSR Speech with high pmNoFailedAfterAdm (1st plot); and high pmSetupFailureSfXX indicating TXB congestion. However, the DL CE usage is very low, seldom peaking above 6 channel elements, so this doesn't make sense. After investigating numerous RBSs showing these symptoms it was established that they all had a single HS-TXB, as opposed to the other RBSs, which all had a TXB as well as an HS-TXB. Both configurations are valid and have more than sufficient downlink channel element capacity. It was also noted that if the RBS was restarted then the problem disappeared for a few days and then re-appeared; this is clearly visible in the plots, where the restart occurred on 2 January. This turned out to be a software fault in the single HS-TXB configuration (due to a failure to release some resources on the TXB). The fix was delivered in software release P4.0.20 (whereas the release installed on the nodes at the time was P4.0.12).