Application and Software Products Cases

A&S Products Cases Table of Contents

Confidential Information of Huawei. No Spreading without Permission

Table of Contents

Chapter 1 FIN Cases

1.1 Why SSP cannot play variable voice of duration
1.2 How to recover HP system while losing the device file
1.3 No announcement caused by incorrect or no configuration in msglocation
1.4 Access Database Error -952 when login to Service SMAP because of incorrect superman password
1.5 After restart SCP the process is unable to start normally
1.6 Execute login root TELLIN internal command failed on mmltool of SMP
1.7 Add OAMserver network unit on physical topology interface of I2000 client failed
1.8 Can't Load Service In SDU Because Of Wrong Configuration
1.9 On system SMAP charging audit with SSP is fail
1.10 No DBAGT Process Caused MML Login Failed

Chapter 2 WIN Cases

2.1 SMP prompt error message when it start due to SCPID is inconsistent
2.2 How to recover HP system while losing the device file
2.3 Gprs license was expired in every 7 days because of wrong data
2.4 IN can't send delete command to RBT because there is no portal id was configuraed for msisdn
2.5 After restart SCP the process is unable to start normally
2.6 Execute login root TELLIN internal command failed on mmltool of SMP
2.7 Add OAMserver network unit on physical topology interface of I2000 client failed
2.8 MML command fail due to incorrect rentflag format
2.9 Vpn websmap run unstably due to incorrect read timeout parameter
2.10 Database operation delay leads to tc_begin missing

Chapter 3 IPCC Cases

3.1 Openeye can not login with a certain number
3.2 OAS program auto shutdown
3.3 Database objects turn invalid lead Agent Software can't login, the message like this, connecting the platform fail
3.4 The call is released when the UAP try play the first file
3.5 IPCC QC failed to get the voice record history through the management client
3.6 The possible cause when ICDCOMM platform plays sometimes voice successfully, sometimes not. A specific IVR problem
3.7 IPCC can't receive part of calls with incorrect conversion of accesscode
3.8 The resolvling process of the problem that forwarding to ivr flow failed in agent
3.9 The status of voice circuits were fault because of the configuration of gateway type
3.10 UAP8100 can not display the original caller number of No.7 which after PRA transformed

Chapter 4 CRBT Cases

4.1 SMP prompt error message when it start due to SCPID is inconsistent
4.2 How to recover HP system while losing the device file
4.3 Gprs license was expired in every 7 days because of wrong data
4.4 IN can't send delete command to RBT because there is no portal id was configuraed for msisdn
4.5 After restart SCP the process is unable to start normally
4.6 Execute login root TELLIN internal command failed on mmltool of SMP
4.7 Add OAMserver network unit on physical topology interface of I2000 client failed
4.8 MML command fail due to incorrect rentflag format
4.9 Vpn websmap run unstably due to incorrect read timeout parameter
4.10 Database operation delay leads to tc_begin missing
4.11 Sometimes hear huge noise when call a CRBT user
4.12 Samba user have not enough permission cause EAS submit configuration failed
4.13 CRBT subscriber information data difference makes a SS7 signal congestion alarm in URP

Chapter 5 SMC/SMS GW Cases

5.1 One reason for Mo error code 255 in CG sm service
5.2 The Messages are Delete Unexpectly
5.3 Configuration for Working With MMBOX With Timeout and Expiry Time
5.4 Wrong config in clustermng.ini cause dbdaemon status become slave on both nodes
5.5 Customer Complains That Send Only One SM But Was Charged Many Times
5.6 CDMA(SMPP) Error Code Statistics Result Doesn't Contain any Data
5.7 SMSGW Got Error 8 Because Charging Problem
5.8 Incrrect Configuration Caused SMSC MNP Function Test Failed
5.9 How to Solve Unknown Protocol When Tracing Interface Between Map and Mtiserve
5.10 DBDAEMON is Disconnected With The SMSC

Chapter 6 infoX-WISG Cases

6.1 Network cable of SUN mini computer ce1-ce4 could not display correct IP address
6.2 The name of the fold of wapgw dual-system script is incorrect
6.3 New wapgw users cannot use VI to edit configuration files
6.4 MMS can not be sent
6.5 The Oracle_listener resource's status is fault.
6.6 Database jobs fail to start
6.7 Processes fail to connect the database due to the owner error of oracle user
6.8 Process Constant Restart Due to Format Error in Configuration File
6.9 Realizing Efficient File Download in Case of Network Packet-Loss
6.10 The wapgw http module often exits abnormally and core file is generated

Chapter 7 infoX-MMSC Cases

7.1 Configuration for Working With MMBOX With Timeout and Expiry Time
7.2 PPSAgent Not Running in MMSC
7.3 Incorrect Parameter Configurations for MCAS Leads to Unvvailable MCAS Services
7.4 MMSC Could Not Send Dr To Local Users After Migrating To Mdsp Charging
7.5 The Value of 18th Field of MMSC CDRs is Incorrect
7.6 How to Solve The Access of MMBox Portal Becomes Slowness
7.7 MMSC cannot connecto to MDSP
7.8 Wrong Configuration Cause The New Subscriber Can Not Send And Receive MMS
7.9 The Problem That Some Prepaid Subscriber Can Not Send MMS
7.10 Direct Push Message Submit Fail

Chapter 8 I2000 Cases

8.1 Database Usage Is Too High - Table space=DCNMTEMPDB_TBS, Client login fails
8.2 Guide to expanding DB tablespace on I2000
8.3 How to clear the alarm data from the I2000 database
8.4 FAQ-How to screen the snmpagent process of I2000
8.5 I2000 can not add new NE after deleted the old NE(the same type)
8.6 NorthPerf service has stop running
8.7 I2000 Current Fault Alarm Browser is not refreshing
8.8 Which is the difference between SNMP and MML signaling in I2000?
8.9 Can not create NE MT server on I2000 client
8.10 A fake CPU usage is showed by I2000

Chapter 9 USAU Cases

9.1 MTP Link failed because of peer end problem
9.2 EPI E1 local lose synchronization alarm in USAU alarm window
9.3 USAU Sinaling remote syn lose
9.4 Abnormal connection between MEM module and SMC because module parameter slip window is open during SMC deployment
9.5 USAU wcsu board fault alarm problem

Chapter 10 OS Cases

10.1 How to Solve The problem "Windows out of lisence"
10.2 Command mstsc console not available in Windows XP SP3, Server 2008, Vista
10.3 a method to resolve password-lost problem(base on SUSE series OS)
10.4 FAQ-How to take snapshot in UNIX
10.5 n2kuser User cannot Log in to SUN Operation System
10.6 Can't find hard disk when installing SUSE Linux on HP DL380G4
10.7 Linux System Cannot Save Modified Time, and After Restart, the Clock Changes.

Chapter 11 DB Cases

11.1 Oracle cluster on windows 2003
11.2 Rebooting linux cause create oracle database failed
11.3 Not automatic set environment variable causes that Oracle SqlPlus fails to start
11.4 how to solve Oracle lock issue
11.5 Oracle Undo Tablespace increasing Rapidly
11.6 SQL 2000 Server JDBC Driver Error by a bad installation of SQL Server

Chapter 1 FIN Cases

1.1 Why SSP cannot play variable voice of duration

Title: Why SSP cannot play variable voice of duration

ID: SE0000337543

Update time: 2008-07-25 11:15:43

Author: Wang Yifeng

Product Family: IN

Product: FIN

Fault Type: SCP

Keywords: duration variable voice

Phenomenon Description:

After the SCP sent a PC (PromptAndCollectUserInformation) operation to play a duration, the SSP returned "resource unavailable".

Alarm Information:

resource unavailable

Cause Analysis:

The SCP call trace shows no obvious problem, so the problem may be that there is no corresponding voice file on the SSP.

[TiINAPPromptAndCollectUserInformationArg]

collectedInfo=

collectedDigits=

minimumNbOfDigits=1

maximumNbOfDigits=1

cancelDigit="*"

firstDigitTimeOut=1

interDigitTimeOut=1

errorTreatment=0

interruptableAnnInd=TRUE

voiceInformation=FALSE

voiceBack=FALSE

disconnectFromIPForbidden=TRUE

informationToSend=

inbandInfo=

messageID=



variableMessage=

elementaryMessageID=0x3f50007b(1062207611)

variableParts=

[0]=

duration=00 00 00 73

numberOfRepetitions=1

duration=0

----------------------------------

time : 2008- 6-18 16:01:03.672

----------------------------------

the Request Primitive: TC-INVOKE.

The code :

0 01 00 00 00 0B 00 00 00 10 00 09 00 03 00 02 00 ................

16 00 00 00 07 D9 00 00 00 00 00 00 00 00 00 55 10 ..............U.

32 00 00 07 D9 00 00 07 D9 01 0A 00 00 30 01 00 00 ............0...

48 07 08 00 40 30 3E A0 1D A0 1B 80 01 01 81 01 01 ...@0>..........

64 83 01 0B 85 01 01 86 01 01 87 01 00 88 01 FF 89 ................

80 01 00 8A 01 00 81 01 FF A2 1A A0 18 A0 10 BE 0E ................

96 80 04 3F 50 00 7B A1 06 86 04 00 00 00 73 81 01 ..?P.{.......s..

112 01 82 01 00

However, the voice file "3f50007b" does exist on the SSP. The SSP trace shows that the variable part was received as DTMF, not as a duration. Checking the binary code sent by the SCP shows that the SCP sent the wrong code: in "A1 06 86 04 00 00 00 73", the type tag for a duration should be 85, not 86. 86 means DTMF.

Why did the SCP send the wrong code? SCP R&D confirmed that the IsTELLINOVS parameter in scusys.cfg must be set to 1 for OVS.

The type tags of variableParts are as follows:

80 integer

81 number

82 time

83 date

84 price

85 duration

86 DTMF
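The check performed on the byte string above can be sketched in a few lines of Python. This is only an illustration: the function name `decode_variable_parts` is hypothetical, the tag table is copied from the list above, and the sketch assumes short-form (single-byte) lengths, which is what the trace in this case uses.

```python
# Type tags for variableParts, as listed in this case.
VARIABLE_PART_TAGS = {
    0x80: "integer",
    0x81: "number",
    0x82: "time",
    0x83: "date",
    0x84: "price",
    0x85: "duration",
    0x86: "DTMF",
}


def decode_variable_parts(hex_text):
    """Decode a variableParts (tag A1) element into (type name, value hex) pairs.

    Assumes every length fits in one byte (short form).
    """
    data = bytes.fromhex(hex_text)
    if data[0] != 0xA1:
        raise ValueError("expected variableParts tag A1")
    body = data[2:2 + data[1]]
    parts = []
    pos = 0
    while pos < len(body):
        tag, size = body[pos], body[pos + 1]
        value = body[pos + 2:pos + 2 + size]
        parts.append((VARIABLE_PART_TAGS.get(tag, "unknown"), value.hex()))
        pos += 2 + size
    return parts


# The bytes from the faulty trace decode as DTMF, not as a duration.
print(decode_variable_parts("A1 06 86 04 00 00 00 73"))
```

With the tag corrected to 85, the same four value bytes (00 00 00 73, decimal 115) would decode as a duration.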

Handling Process:

Set the IsTELLINOVS parameter in scusys.cfg to 1 for OVS.



1.2 How to recover HP system while losing the device file

Title: How to recover HP system while losing the device file

ID: SE0000336739

Update time: 2007-09-27 05:41:24

Author: Qian Liang

Product Family: IN

Product: FIN

Fault Type: Data Configuration

Keywords: HP Device file

Digest: OS

Phenomenon Description:

The customer deleted the device file "/dev/rmt/0m" by mistake, so the tape drive could no longer be used.

Alarm Information:

The device file "/dev/rmt/0m" is missing and the tape drive cannot be used.

Cause Analysis:

To solve this problem, the device files of the HP system must be recreated. There are two ways to do this:

1. Run "insf -e" to reinstall the device files.

2. Rebuild the device file with the "mknod" command; this requires knowing the device (major and minor) numbers.

Handling Process:

Run the command "insf -e". The system then recreates the device file automatically, and the tape drive works normally.

Suggestions and summary:

Similar problems caused by a lost device file can be solved by regenerating the hardware device files with "insf -e".
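Before and after running "insf -e", it can help to confirm whether the device file is really present. The following is a minimal sketch, assuming the tape device is a character special file; the function name `device_file_status` is hypothetical and not part of any HP-UX tool.

```python
import os
import stat


def device_file_status(path):
    """Classify a path as 'missing', 'not a character device', or 'ok'."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    # Assumption: tape device files are character special files.
    return "ok" if stat.S_ISCHR(mode) else "not a character device"


if __name__ == "__main__":
    print(device_file_status("/dev/rmt/0m"))
```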

1.3 No announcement caused by incorrect or no configuration in msglocation

Title: No announcement caused by incorrect or no configuration in msglocation

ID: SE0000333390

Update time: 2008-06-26 13:45:54



Author: Yong Yeng Ling

Product Family: IN

Product: FIN

Fault Type: SCP

Keywords: msglocation announcement

Digest:

Phenomenon Description:

During FIN PPS implementation, no announcement can be played and the service releases the call. This can happen on all FIN platform and service versions.

Alarm Information:

Setval trace shows the following error:

[19:00:52.952][-2] ssd.nMessageIDFlag = 0,nUserChooseValue = 1
[19:00:52.952][-1] No valid data in MsgConversion[].
[19:00:52.952][-1] *** No MessageID conversion! 100728966 (HEX: 6010086) -> 100728966 (HEX: 6010086)
[19:00:52.956][-1] ***PROMPT*** Use default sspipindex value(-1) to get device state in getDeviceState (original sspipindex's value is 2000)
[19:00:52.956][-1] m_nrOfMsgLocation is invalid in getDestSSPIPIndex(), m_nrOfMsgLocation = 0 .
[19:00:52.957][4000001] *** error in /VIPguest/zcr/source/scf/itu/inap/sib/inapuisib.C, 5357, getDestSSPIPIndex return invalid SSPIPIndex(-1)
There is not corresponding record in table(msglocation), messageID = 100728966, resourceType = ElementaryMessageID(0)
[19:00:52.957][4000001] decide messageID on which SSP/IP error.[inapuisib.C:499].

Cause Analysis:

The key error in the trace is:

"getDestSSPIPIndex return invalid SSPIPIndex(-1)
There is not corresponding record in table(msglocation)"

No matching record can be found in the msglocation table.

To play an announcement, the service needs to know where (SSP, AIP, etc.) to send it. The service first searches by SSPIPIndex=2000, the ID of the SSP from which the IDP came. If no record is found, it tries SSPIPIndex=-1 (the default).

This setting is configured in SYS SMAP > System Management Table > Standalone IP management (see the attached diagram for the configuration).

Handling Process:

Configure the msglocation table through System SMAP and synchronise it to the SCP server. The service trace then shows that PA and PC are sent to the SSP.

Below is a reference configuration used for operator F, whose solution is for the SSP to play the announcements.

Select * from msglocation



locationid 1

scpid 101

resourcetype 0

messagebegin 0

messageend 4294967295

srcsspipindex 2000

destsspipindex1 2000

destsspipindex2

relationtype1 0

relationtype2

devicestate 0

destsspipindex3

relationtype3

areano 3
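The lookup order described in the cause analysis (source SSP index first, then the default index -1) can be illustrated with the reference record above. This is a hedged sketch, not the platform's code: the function name `find_dest_sspip_index` is hypothetical, and matching on resourcetype and on the messagebegin/messageend range is an assumption drawn from the column names and the trace text.

```python
DEFAULT_SSPIP_INDEX = -1


def find_dest_sspip_index(msglocation, src_index, message_id, resource_type=0):
    """Return destsspipindex1 of the first matching row, or None if none matches.

    The source SSP index is tried first, then the default index (-1).
    """
    for candidate in (src_index, DEFAULT_SSPIP_INDEX):
        for row in msglocation:
            if (row["srcsspipindex"] == candidate
                    and row["resourcetype"] == resource_type
                    and row["messagebegin"] <= message_id <= row["messageend"]):
                return row["destsspipindex1"]
    return None


# Reference record from this case (only the columns the sketch uses).
reference_rows = [{
    "srcsspipindex": 2000,
    "destsspipindex1": 2000,
    "resourcetype": 0,
    "messagebegin": 0,
    "messageend": 4294967295,
}]

print(find_dest_sspip_index(reference_rows, 2000, 100728966))
print(find_dest_sspip_index([], 2000, 100728966))
```

With an empty table the function returns None, which corresponds to the "invalid SSPIPIndex(-1)" error in the trace.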

Suggestions and summary:

None.

1.4 Access Database Error -952 when login to Service SMAP because of incorrect superman password

Title: Access Database Error -952 when login to Service SMAP because of incorrect superman password

ID: SE0000333850

Update time: 2008-06-26 13:44:15

Author: Saadullah sheikh s76188

Product Family: IN

Product: FIN

Fault Type: SMAP

Keywords: 952 FIN Service SMAP

Digest:

Phenomenon Description:

The SMP usually runs on server FIN2. After a switchover to FIN1, login to Service SMAP fails.

This case applies to all FIN Service SMAP versions (IIN and ENIP platform versions).

Alarm Information:

Access Database error, error number: -952.

SQL error -952: User's password is not correct for the database server .sql.

Cause Analysis:

The problem is that the superman password is not defined correctly on FIN1.

The FIN Service SMAP flow differs from the WIN Service SMAP flow:

The database user for WIN is sms.

The database users for FIN are superman, sysman, servman and subman.

These users must therefore be created when the SMP is installed, and their passwords must be set according to the installation manual.

When we log in to Service SMAP as superadm, Service SMAP uses the superman user and the password defined in operator_level to access the SMP database. If the superman user on the SMP has a different password, error -952 occurs and the database connection fails.

Handling Process:

The passwords of superman, sysman, servman and subman on the SMP server must be the same as those defined in the operator_level table of smpdb:

Select * from operator_level

level username userpswd

0 superman smsqazpl

1 sysman sys,./

2 servman serv,./

3 subman sub,./

After the superman password on FIN1 was changed to smsqazpl, login succeeded.

Suggestions and Summary:

There are many troubleshooting cases for error 952, but most of them are for WIN and state that the sms password is wrong. Note that FIN does not use the sms user to log in to the database; it uses superman, sysman, servman or subman.

Each user has a different authority level, and superman has the highest.

1.5 After restart SCP the process is unable to start normally

Title: After restart SCP the process is unable to start normally

ID: SE0000334094

Update time: 2008-07-29 11:37:44

Author: Gudo

Product Family: IN

Product: FIN

Fault Type: SCP

Keywords: RCOMM

Digest:

Phenomenon Description:

After the SCP is restarted, the process cannot start normally.

Alarm Information:

The manager_0.log shows:

[08:13:27][DS0] Prompt: Starting system....
[08:13:27][DS0] Prompt: Read license file successfully
[08:13:27][DS0] Error: Can't read RCOMM Version in scusys.cfg.
[08:13:27][DS0] Error: Fails to initialize system. System exit.

Cause Analysis:

The cause is clear from the log: the scusys.cfg file is configured incorrectly. Listing the directory with "ls -lrt" shows a backup of scusys.cfg from the previous day.

Comparing the files with "diff scusys.cfg scusys.cfg.bak" shows that three RCOMM configurations were added to the new file, including:

RCOMM_FE_ID[14] = 1011
RCOMM_Version[3] = 1
RCOMM_OperationID[3] = 101
RCOMM_UniqueOperationID[3] = 101
RCOMM_ServiceKey[3] = 216
RCOMM_ControlFlag[3] = 0
RCOMM_ExecuteFE[3] = 0 //Indicators the service will be triggered in SCF or SIP NODE
RCOMM_FE_ID[15] = 1011

The RCOMM_FE_ID settings in the file differ.

Handling Process:

1. Roll back to the backup file. The SCP process then starts normally.

2. Check the differences between the files before and after the backup.

3. Keep the RCOMM_FE_ID the same.

4. Check manager_0.log. No refresh log entry can be found after scusys.cfg was changed.

Suggestions and summary:

When changing configuration files, especially system files, refresh the memory and check manager_0.log to confirm that the change was applied successfully.
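The comparison of scusys.cfg with its backup can also be scripted, which is handy when the files are long. The sketch below uses Python's standard difflib module; the function name `added_settings` and the sample text are illustrative assumptions, not content of the real configuration file.

```python
import difflib


def added_settings(old_text, new_text):
    """Return the lines that are present in new_text but not in old_text."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff marks lines that exist only in the second input with "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]


# Illustrative stand-ins for scusys.cfg.bak and scusys.cfg.
backup = "RCOMM_FE_ID[14] = 1011\nRCOMM_Version[3] = 1"
current = "RCOMM_FE_ID[14] = 1011\nRCOMM_Version[3] = 1\nRCOMM_FE_ID[15] = 1011"

for line in added_settings(backup, current):
    print(line)
```

Reviewing only the added lines makes it easier to spot settings that went in after the last known-good backup.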



1.6 Execute login root TELLIN internal command failed on mmltool of SMP

Title: Execute login root TELLIN internal command failed on mmltool of SMP

ID: SE0000334110

Update time: 2008-06-25 16:33:24

Author: Guo Pei

Product Family: IN

Product: FIN

Fault Type: SMP

Keywords: login root TELLIN internal dbcfgtool

Digest:

Phenomenon Description:

When the root user is used to refresh the configuration in the mmltool of the SMP, it reports "System internal error".

Alarm Information:

smcp2 4: start.sh mml
Starting MML ...
2008-06-12 23:11:51 MML>>> ACK:QUERY CMC SVCKEY: RETN=1001, DESC="servName is empty";
2008-06-12 23:11:54 MML>>> login root TELLIN internal
2008-06-12 23:12:03 MML>>> ACK:LOGIN: RETN=7, DESC="System internal error".

Cause Analysis:

Based on the error information, check whether the DB agent process sms_common_serv is running.

Handling Process:

1. Use the "p" command to check the sms_common_serv process on the SMP. The process does not exist.

2. Use dbcfgtool to configure commserv and nodeidtab:

smcp2 /tellin/smp/smp/sms_run/bin 9 > dbcfgtool
*********************************************************
Main Menu
1 commserv: config and mod commserv.ini
2 nodeidtab: add and mod nodeidtab
0 exit
*********************************************************

Complete the configuration of commserv and nodeidtab separately, according to the on-site information.

3. Edit the sms.cfg file and uncomment the following line:

proc= SMP1 | sms_common_serv | 132 | -1 | -l db |

4. Start the sms_common_serv process:

>start.sh mml
MML>>> connect to cmc
MML>>>reload cfgfile:

5. After step 4, use "p" again to check whether the sms_common_serv process exists. If it does, "login root TELLIN internal" can be used in the mmltool interface.
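The process check in steps 1 and 5 uses the platform's "p" command. Where only a generic process listing is available, an equivalent check could look like the sketch below. It assumes `ps -ef` style text as input; the function name `process_is_listed` and the sample listing are hypothetical.

```python
def process_is_listed(ps_output, process_name):
    """Return True if any field of any line names the given process."""
    for line in ps_output.splitlines():
        for field in line.split():
            # Compare the executable name only, ignoring any directory prefix.
            if field.rsplit("/", 1)[-1] == process_name:
                return True
    return False


# Illustrative listing, not taken from the site.
sample = (
    "smp 1234 1 0 08:13 ? 00:00:01 sms_common_serv -l db\n"
    "smp 1300 1 0 08:13 ? 00:00:00 sms_secu_serv"
)
print(process_is_listed(sample, "sms_common_serv"))
```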

Suggestions and summary:

Null

1.7 Add OAMserver network unit on physical topology interface of I2000 client failed

Title: Add OAMserver network unit on physical topology interface of I2000 client failed

ID: SE0000334111

Update time: 2008-06-25 16:33:19

Author: Guo Pei

Product Family: IN

Product: FIN

Fault Type: OAM

Keywords: I2000 super FQHY6Q1c

Digest:

Phenomenon Description:

When adding an OAMserver node on the physical topology interface of the I2000 client, using "super" as both the username and the password as the instruction manual describes, the client reports "username and password incorrect".

Alarm Information:

None

Cause Analysis:

If adding an OAMserver node on the I2000 fails, first check the communication between the OAMserver and the I2000 server, then check whether the OAMserver is running normally, and make sure the port and IP parameters are correct. "super" is the default username and password for adding an OAMserver node to the I2000. If adding the node fails as described, the reason may be that somebody changed the password of super, or that the OAMserver and I2000 versions do not match exactly.

All of the above was confirmed to be fine except the version match. The OAMserver version is ENIPV100R001.1Dh and the I2000 version is Df. With the Df series of I2000, the password entered for the super user is encrypted and then sent to the OAMserver. The Dh series OAMserver does not support decryption, so it reports "username and password incorrect".
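The first check mentioned above, whether the OAMserver port can be reached at all, can be done with a few lines of Python. This is a generic TCP connectivity sketch; the function name `port_is_reachable` is hypothetical, and the host and port must be replaced with the site's own values.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A successful connection only proves that the port is open. It says nothing about credentials or version compatibility, which were the actual cause in this case.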

Handling Process:

1. Telnet to the OAMserver via the maintenance port:

telnet 10.76.100.155 23203

2. Change the password of the super user:

chg oam ownpswd:oldpswd=super,newpswd=FQHY6Q1c,cfmpswd=FQHY6Q1c;

(FQHY6Q1c is the encrypted form of the password "super".)

3. Test it:

login:user=super,pswd=FQHY6Q1c;

4. If the login succeeds, "super" can be used as the password of the super user to add the OAMserver on the I2000 client.

Suggestions and summary:

None

1.8 Can't Load Service In SDU Because Of Wrong Configuration

Title: Can't Load Service In SDU Because Of Wrong Configuration

ID: SE0000334911

Update time: 2008-06-25 16:30:34

Author:

Product Family: IN

Product: FIN

Fault Type: SCP

Keywords: load service sdu

Digest:

Phenomenon Description:

Loading a service in the SDU fails with the error "Can't read file MPPT_SERV100R001C01B044".

Alarm Information:

"Can't read file MPPT_SERV100R001C01B044".

Cause Analysis:

The service is loaded with serviceadm, so the process depends on the following points:

1. The permissions of the service file.

2. The permissions of the sms directory.

3. The mode of the file (ASC or BIN).

We checked these and all of them were OK.

The platform first extracts the service file, so we checked the extraction script "extractear.sh", which is used for SIP services.

We found that the platform uses the jar command, as in "jar xvf $tempName 2>&1".

We then ran jar manually, and the system reported that there is no such command. This is the root cause.

Handling Process:

1. Install the JDK on the SDU. The service can then be loaded successfully.

2. Alternatively, copy the java directory from the SCU and configure the Java environment.

For SIP services, select both scp and sipnode when installing the SDU. sipnode does not need to be started, but it must be installed.
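A quick way to catch this kind of problem before loading a service is to verify that the tools the scripts rely on are on the PATH. The sketch below uses Python's standard `shutil.which`; the function name `missing_tools` is hypothetical, and the tool list is only an example based on the jar command used by extractear.sh.

```python
import shutil


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["jar"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools are available.")
```

The check must be run as the same user that loads the service, because the PATH can differ between users.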

Suggestions and summary:

None

1.9 On system SMAP charging audit with SSP is fail

Title: On system SMAP charging audit with SSP is fail

ID: SE0000329129

Update time: 2008-06-25 16:30:34

Author: Fan Xiaofeng

Product Family: IN

Product: FIN

Fault Type: SMAP

Keywords: Charging audit fail

Digest:

Phenomenon Description:

After the customer reinstalled the SSP BAM, the charging audit with the SSP fails on System SMAP.

Alarm Information:

When the operator runs a charging audit from System SMAP to the SSP, the audit result is 'Operation failure' for all tables (see attachment audit_fail.jpg).

Cause Analysis:

A trace was started on the SMP server. It shows that the SSP returns an error:

[][2008-04-24 16:47:13.712][tfeecheckbase.c.1748][Prompt][Run]check msg is Snd scpchg: OP="servman",RDIR="/smp/smpsys/smp_run/charge",PWD="serv,./",FN="feefile.unl",IP="172.22.23.13",NI=2,DPC="0000AC";
[][2008-04-24 16:47:13.713][tfeecheckbase.c.1770][Prompt][Run]It is hand check, now all ssp is login ok,send report to smap: Login success
[][2008-04-24 16:47:13.713][tfeecheckbase.c.1774][Prompt][Run]
Table <- all
SSP <- all
ProcessMsg <- 6
Msg <- Login success
[][2008-04-24 16:47:13.713][hmsgfsm.h.67][Prompt][Run]start time:1209073633 713675
[][2008-04-24 16:47:14.771][hmsgfsm.h.72][Prompt][Run]end time:1209073634 771818
[][2008-04-24 16:47:14.772][hmsgfsm.c.410][Prompt][Run]WAIT TIME 1124243
[][2008-04-24 16:47:14.772][tfeecheckbase.c.1471][Prompt][Run]The current state is PROC_GETDATA
[][2008-04-24 16:47:14.772][tfeecheckbase.c.1623][Prompt][Run]received ssp 2000 's respond : 40224
[][2008-04-24 16:47:14.772][tfeecheckbase.c.1895][Prompt][Run]the return code is 40224
[][2008-04-24 16:47:14.772][tfeecheckbase.c.1899][Prompt][Run]the ssp 2000's state is WAIT_CHECK
[][2008-04-24 16:47:14.773][tfeecheckbase.c.1900][Prompt][Run]change the ssp 2000's state to CHECK_FAIL
[][2008-04-24 16:47:14.773][tfeecheckbase.c.1905][Prompt][Run]It is hand check, send report to smap: SSP fee data check error
[][2008-04-24 16:47:14.773][tfeecheckbase.c.1908][Error][Run]
Table <- all
SSP <- 2000
ProcessMsg <- 3
Msg <- SSP fee data check error
[][2008-04-24 16:47:14.773][tfeecheckbase.c.1911][Prompt][Run]set all the ssps OPERRATE_FAIL in table fee_status_ssp

(see attachment smpkerdebug0.txt).

Handling Process:

Return code 40224 from the SSP indicates a problem with the SSP BAM registry.

Run regedit on the SSP BAM and check the key:

HKEY_LOCAL_MACHINE\SOFTWARE\HuaWei\CC08\System

There, SndScpDataPath is set to D:\CC08\ScpData (see attachment register.jpeg).

A check on the BAM shows that there is no folder d:\cc08\ScpData. The folder is actually located at c:\cc08\ScpData.

The registry value was changed to match the actual location on drive c:.

After this, Charging Audit on System SMAP started to work.
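When the SMP trace is long, the SSP return codes can be pulled out with a small script instead of being read by eye. The sketch below is an illustration only: the function name `ssp_return_codes` is hypothetical, and the pattern is modelled on the single "received ssp ... 's respond" line shown in this case, so other log variants may need a different expression.

```python
import re

# Modelled on the trace line "received ssp 2000 's respond : 40224".
RESPONSE = re.compile(r"received ssp (\d+)\s*'s respond : (\d+)")


def ssp_return_codes(trace_text):
    """Return (ssp id, return code) pairs found in the trace text."""
    return [(int(ssp), int(code)) for ssp, code in RESPONSE.findall(trace_text)]


trace_line = (
    "[][2008-04-24 16:47:14.772][tfeecheckbase.c.1623][Prompt][Run]"
    "received ssp 2000 's respond : 40224"
)
print(ssp_return_codes(trace_line))
```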

Suggestions and summary:

When reinstalling software, install it to the same path as before.

Otherwise, problems may occur in the interconnection with third-party devices.

1.10 No DBAGT Process Caused MML Login Failed

Title: No DBAGT Process Caused MML Login Failed

ID: SE0000335257

Update time: 2008-06-25 16:28:24

Author:

Product Family: IN

Product: FIN

Fault Type: SMP

Keywords: MML Login DBAGT

Digest:

Phenomenon Description:

While upgrading the PPT service at a site, the authentication table needed to be refreshed. However, MML login failed with a system internal error, so the authentication data could not be refreshed, as shown below.


Starting MML ...


ame is empty";


2008-06-12 23:12:03 MML>>> ACK:LOGIN: RETN=7, DESC="System internal error";

Alarm Information:

Null

Cause

Analysis:

1. First, the error information can be found in oamdbg.txt. It shows that smpmml failed to handle the message:

[11:59:53.207][-1] [oamfeam.C 7497]the mml ack received from smp(smpmml): ACK:LOGIN: RETN=7, DESC="System internal error";

[11:59:53.207][-1] [oamfeam.C 7539]invalid dlgControl(DLGEND)!

[11:59:53.207][-1] [oamfeam.C 2679]failed to process mml ack from SMPMML.

2. Second, the smpmml log "sms_sms_mmltool_run_20080623.log" shows that smpmml calls the SECU process:

[2008-06-23 11:58:38][ Notice][0407015221] Service requesting:SECU

3. Third, the SECU log "sms_105_sms_secu_serv_run_20080623.log" shows that the SECU process cannot get the authentication data:

[2008-06-23 11:58:22][Warning][0407014003] op_do errId:13 errInfo:Operation authentication failed auth info not available

4. Finally, to find out why SECU could not obtain the authentication data, we printed the detailed debug log "sms_105_sms_secu_serv_debug.log".

It shows that SECU gets the authentication data from the database through DBAGT, but the message could not be sent to DBAGT because the DBAGT process had not been started.

[2008-06-23 11:59:53][ Error][0407013084] modlName:mech_op errId:624 errInfo:Unable to obtain the session in the specified service, Service name:DBAGT

[2008-06-23 11:59:53][ Error][0407013201] Sending message to DBAGT failed.
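
The last step of the analysis, spotting which service could not be reached, can be sketched as a small log scan. This is only an illustration in Python: the regular expression is modelled on the single debug line quoted in this case, and the function name is hypothetical, not part of any TELLIN tool.

```python
import re

# Pattern modelled on the "Unable to obtain the session" line quoted in this case.
SESSION_ERR = re.compile(
    r"errId:(?P<err_id>\d+)\s+errInfo:Unable to obtain the session "
    r"in the specified service, Service name:(?P<service>\w+)"
)

def missing_services(log_lines):
    """Return the sorted names of services whose session could not be obtained."""
    found = set()
    for line in log_lines:
        match = SESSION_ERR.search(line)
        if match:
            found.add(match.group("service"))
    return sorted(found)

sample = [
    "[2008-06-23 11:59:53][ Error][0407013084] modlName:mech_op errId:624 "
    "errInfo:Unable to obtain the session in the specified service, Service name:DBAGT",
    "[2008-06-23 11:59:53][ Error][0407013201] Sending message to DBAGT failed.",
]
print(missing_services(sample))
```

A service name reported here is a hint to check whether the corresponding process is actually running.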

Handling

Process:

1. Use the tool "dbcfgtool" to configure commserv.ini and nodeidtab.

2. Uncomment the line "sms_common_serv db" in sms.cfg (remove the comment mark at the head of the line).

3. Restart the SMP, or refresh the configuration as follows:

sun32 /tellin/smp/smp/sms_run/cfg 25 > start.sh mml

2008-06-23 14:12:07 MML>>> connect to cmc

2008-06-23 14:12:20 MML>>> reload cfgfile.

Suggestions

and

summary:

None


Chapter 2 WIN Cases

2.1 SMP prompt error message when it start due to SCPID is inconsistent

Title: SMP prompt error message when it start due to SCPID is inconsistent

ID: SE0000339940

Update time: 2008-07-25 11:15:43

Author: Huangxiancai

Product

Family:

IN Product: WIN

Fault Type: SMS

Keywords: SMP SCP

Digest:

Phenomenon

Description:

SMP prompts the following error message when it starts.

[][2008-07-16 09:40:24] [Error][Run]fail to get intfversion from sys_netconfig to update sdpVersionInfo:this node no found.

Alarm

Information:

[][2008-07-16 09:40:24] [Error][Run]fail to get intfversion from sys_netconfig to update sdpVersionInfo:this node no found.

Cause

Analysis:

When the SMP starts, it synchronizes the SCP version information with the SCP.

First, the SMP connects to the SCP and then checks the SCPID against inetcfg.cfg. If the SCPID is consistent, the SMP gets the version information and updates it in the sys_netconfig table. If it is inconsistent, the version update fails with the error above.

Handling

Process:

First, I ran ftp to the SCP. The ftp succeeded and no password was required, which meant the connection between the SCP and the SMP was OK.

Then I checked the value of TELLIN_SCPID in the .cshrc file on the SCP and found that the SCPID was 101, but NO was 102 in the inetcfg.cfg file on the SMP.

SCP :

TELLIN_SCPID=101

SMP:

[SCP.1]

NO = 102

HOST = 192.168.100.101

PORT = 10001

DRMHOST = 192.168.100.101




DRMPORT = 10001

DBNAME = scudb

STATUS = 1

SUBTYPE = 1

MANAGETYPE = 1

VERSION = ENIPV100R001.1DH03SP03

FTPUSER = sms

FTPPSWD = 337C6AC8C440EC0F

COMMENT = SCDU

So the cause was that the SCPID was inconsistent.

I changed TELLIN_SCPID to 102, made the change take effect, and then restarted the SMP. The problem was solved.
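
The consistency check described above can be sketched in Python. This is a minimal illustration, assuming inetcfg.cfg can be read as INI-style text and that .cshrc contains either `TELLIN_SCPID=101` (as quoted in the case) or the csh form `setenv TELLIN_SCPID 101`. The function names are hypothetical.

```python
import configparser
import re

def scp_no(inetcfg_text, section):
    """Read the NO value of one [SCP.n] section from inetcfg.cfg-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(inetcfg_text)
    return parser[section]["NO"].strip()

def tellin_scpid(cshrc_text):
    """Extract TELLIN_SCPID from 'TELLIN_SCPID=101' or 'setenv TELLIN_SCPID 101'."""
    match = re.search(r"TELLIN_SCPID\s*(?:=|\s)\s*(\d+)", cshrc_text)
    return match.group(1) if match else None

def scpid_consistent(inetcfg_text, section, cshrc_text):
    """True when the SMP-side NO equals the SCP-side TELLIN_SCPID."""
    return scp_no(inetcfg_text, section) == tellin_scpid(cshrc_text)

inetcfg = """
[SCP.1]
NO = 102
HOST = 192.168.100.101
PORT = 10001
"""
print(scpid_consistent(inetcfg, "SCP.1", "TELLIN_SCPID=101\n"))  # values from the case
print(scpid_consistent(inetcfg, "SCP.1", "TELLIN_SCPID=102\n"))  # after the fix
```

Comparing the two values before starting the SMP would reveal the mismatch without waiting for the startup error.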

Suggestions

and summary:

Null.

2.2 How to recover HP system while losing the device file



ID: SE0000336739

Update

time:

2007-09-27 05:41:24

Author: Qian Liang

Product

Family:

IN Product: WIN



Digest: OS

Phenomenon

Description:


uld not be in use.

Alarm

Information:


ould not be in use.

Cause

Analysis:


do it.






Handling

Process:



Suggestions

and

summary:



2.3 Gprs license was expired in every 7 days because of wrong data

Title: Gprs license was expired in every 7 days because of wrong data

ID: SE0000339885

Update

time:

2008-06-26 13:45:54

Author: Abdullah Siddiquey

Product

Family:

IN Product: WIN

Fault Type: SMS

Keywords: license expired

Digest:

Phenomenon

Description:

After uploading the GPRS service license to the SCP and SMP, we found that it expired every 7 days. After 7 days we uploaded it again, but it expired again.

Alarm

Information:

gprs license was expired.

Cause

Analysis:

Normally, SMPSER should update this service license every night. If SMPSER fails to update it for 7 days, the license expires.

First, we checked whether SMPSER was running every day, and found that it was.

Second, we checked the sync_servicenumber.log file on the SMP and found that SMPSER was unloading this data every day but not updating it.

One day's data:
===========
update pps_scpconfig set Argument2='551569151321053180972777119753137525' where servicekind=1040 and brandid=0

Next day's data:
===========
update pps_scpconfig set Argument2='551569151321053180972777119753137525' where servicekind=1040 and brandid=0

That means SMPSER was getting the data but not updating it.

Finally, we checked the data in the pps_licensecfg table on the SMP and found the following:

servicekind position
1011 6
1012 8
1013 11
1020 3
7200 28

There was no row for servicekind=1040 (GPRS service). This table defines which license data should be updated, so SMPSER was not updating the GPRS license.

Handling

Process:

Add the following row to the pps_licensecfg table:

1040|25

The license was then updated the next day, so the problem was solved.

The meaning of this data is:
======================
First field: servicekind=1040 (GPRS service license)
Second field: position=25 (Multiserviceflag[25], which is used for enabling the GPRS service).
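
The check that exposed the missing row can be written down as a small Python sketch. The dictionary mirrors the pps_licensecfg rows quoted in this case; the function name and the list of licensed service kinds are illustrative assumptions, not output of any SMP tool.

```python
# Rows of pps_licensecfg as listed in the case (servicekind -> position).
licensecfg = {1011: 6, 1012: 8, 1013: 11, 1020: 3, 7200: 28}

def missing_license_rows(licensed_servicekinds, cfg):
    """Return the service kinds that hold a license but have no row in cfg."""
    return sorted(kind for kind in licensed_servicekinds if kind not in cfg)

print(missing_license_rows([1011, 1040], licensecfg))  # 1040 has no row yet
licensecfg[1040] = 25                                   # the fix: row "1040|25"
print(missing_license_rows([1011, 1040], licensecfg))
```

Any service kind reported by the first call would never be refreshed by the nightly update, which matches the 7-day expiry seen here.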

Suggestions

and

summary:

We should also check the pps_licensecfg table to confirm that the required row exists.

2.4 IN can't send delete command to RBT because no portal id was configured for msisdn

Title: IN can't send delete command to RBT because no portal id was configured for msisdn

ID: SE0000337729

Update Time: 2008-07-22 15:13:36


Product

Family:

IN Product: WIN

Fault Type: Service

Keywords: Failed to Request Delete RBT for msisdn:'1559000001'! code:-1

Digest:

Phenomenon

Description:

When an RBT subscriber fails to pay the next monthly fee, eventagent should send a command to the RBT portal to disable the service. But the IN was not sending that command to the portal.

Alarm

Information:

Failed to Request Delete RBT for msisdn:'1559000001'! Code:-1.

Cause

Analysis:

We enabled the debug flag for the scpser eventagent. In evengagent_debug.log, we found the following error:

106 (2008-07-03 08:31:02.709)Warn: [CRBTPortal] Cannot Find PortalID for msisdn:'1559000001'. result:-1

107 (2008-07-03 08:31:02.709)Debug: [SOCKET] Enter Close().

108 (2008-07-03 08:31:02.710)Debug: [SOCKET] Exit Close().

109 (2008-07-03 08:31:02.710)Warn: [CEventAgent]:ProcessEventType4() Failed to Request Delete RBT for msisdn:'1559000001'! code:-1

That means the system did not get the portal ID for that subscriber. The number segment and portal ID are configured in the msisdntoaip table. Querying this table returned the following data:

msisdnstart : 1550000000

msisdnstop : 1559000000

portalid : 1

But our number is 1559000001, so it is out of range.
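
The segment lookup can be illustrated with a short Python sketch. It assumes that the segment bounds are inclusive and compared numerically, which the case does not state explicitly; the function name is hypothetical.

```python
# Segment rows as (msisdnstart, msisdnstop, portalid); values taken from the case.
segments = [(1550000000, 1559000000, 1)]

def find_portal_id(msisdn, rows):
    """Return the portalid whose inclusive segment contains msisdn, else None."""
    number = int(msisdn)
    for start, stop, portal_id in rows:
        if start <= number <= stop:
            return portal_id
    return None

print(find_portal_id("1559000000", segments))  # last number inside the segment
print(find_portal_id("1559000001", segments))  # one past msisdnstop
```

The second lookup finds no portal ID, which corresponds to the "Cannot Find PortalID" warning in the debug log.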

Handling

Process:

Modify the data in table msisdntoaip as follows:



portalid : 1

After that we tested again, and the problem was solved.

Suggestions

and

Summary:

Make sure the configured MSISDN segments cover every subscriber number.

2.5 After restart SCP the process is unable to start normally



ID: SE0000334094

Update time: 2006-07-29 11:37:44

Author: Gudo

Product

Family:

IN Product: WIN

Fault Type: SCP

Keywords: RCOMM

Digest:

Phenomenon

Description:


Alarm

Information:








Cause

Analysis:


and found a backup file of scusys.cfg from yesterday.

Using "diff scusys.cfg scusys.cfg.bak", we found that three Rcomm configurations had been added in the new file: "RCOMM_FE_ID[14] = 1011







NODE








NODE








NODE"


Handling

Process:




4. Check manager_0.log: the refresh log that should appear after changing scusys.cfg cannot be found.

Suggestions

and

summary:

When changing configuration files, especially system files, refresh the memory and check manager_0.log to confirm that the change took effect.

2.6 Execute login root TELLIN internal command failed on mmltool of SMP






ID: SE0000334110

Update time: 2008-06-25 16:33:24

Author: Guo Pei

Product

Family:

IN Product: WIN

Fault Type: SMP


Digest:

Phenomenon

Description:


rror".

Alarm

Information:


Starting MML ...


ame is empty";



Cause

Analysis:



Handling

Process:

1. Use the "p" command to check the sms_common_serv process on the SMP. We find that the process does not exist.



*********************************************************

Main Menu



0 exit

*********************************************************

Finish the configuration for commserv and nodeidtab separately, according to the on-site information.







>start.sh mml



5. After finishing step 4, use "p" to check whether the sms_common_serv process exists. If it does, "login root TELLIN internal" can be used on the mmltool interface.

Suggestions

and

summary:

Null

2.7 Add OAMserver network unit on physical topology interface of I2000 client failed



ID: SE0000334111

Update time: 2008-06-25 16:33:19

Author: Guo Pei

Product

Family:

IN Product: WIN

Fault Type: OAM


Digest:

Phenomenon

Description:




Alarm

Information:

None

Cause

Analysis:




is the default username and password for adding an OAMserver node to I2000. If adding fails like this, the reason may be that somebody changed the password of super, or that the versions of OAMserver and I2000 do not match exactly.

We were sure that everything above was fine except the version match. We found that the version of OAMserver is ENIPV100R001.1Dh and the version of I2000 is Df. For the Df series of I2000, the password of the super user is encrypted before being sent to OAMserver, but the Dh series of OAMserver does not support decryption, so it reports "user name and password incorrect".

Handling

Process:


telnet 10.76.100.155 23203




3.test it.



I2000 client.

Suggestions

and

summary:

None

2.8 MML command fail due to incorrect rentflag format

Title: MML command fail due to incorrect rentflag format

ID: SE0000334911

Update

time:

2008-06-25 16:30:34

Author: qinbin

Product

Family:

IN Product: WIN

Fault Type: SCP

Keywords: rentflag mml

Digest:

Phenomenon

Description:

The billing side complained that they encountered this problem when sending an MML command:

03/07 00:03:06 remain size: 0

03/07 00:03:06 MmlRequest: 00641.00 PPS 00000000DLGCON000000000000TXBEG 0000MODI PPS SUBCOS:MSISDN=1699889363,SUBCOS=24 B2A5ACDC

03/07 00:03:06 Response: RETN=1021;Subscriber's rent service status error.

03/07 00:13:15 remain size: 0


BEG 0000MODI PPS SUBCOS:MSISDN=1698999078,SUBCOS=24 B8A5AEDC





Alarm

Information:

NULL

Cause

Analysis:

We queried the rent service of the 2 subscribers via websmap, and it also prompted a rent service status error. However, other subscribers' rent service could be queried correctly via websmap.

When websmap queries the subcosid, or an MML command modifies it, the system checks which rent service the subscriber has registered. This corresponds to basetab_pps.rentflag. We compared the rentflag field of a normal subscriber and an abnormal subscriber: the rentflag of the normal subscriber looks like "000000000000000000000000000000000000", whereas that of the abnormal subscriber is "0", only one digit.

This should be the root cause.

Handling

Process:

Execute the SQL statement below:

update basetab_pps set rentflag="000000000000000000000000000000000000" where msisdn="1698999078"

Then try websmap and the MML command again; both work.

Suggestions

and

summary:

Normally, when a new subscriber is created, the rentflag in basetab_pps has 36 digits. It should not be modified manually.
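
A format check for rentflag can be sketched in Python as follows. The length of 36 comes from the summary above; the function name is hypothetical, and the check only covers the format, not the meaning of the individual digits.

```python
RENTFLAG_LENGTH = 36  # length stated in the case summary

def is_valid_rentflag(value):
    """True if value consists of exactly RENTFLAG_LENGTH decimal digits."""
    return len(value) == RENTFLAG_LENGTH and all(ch in "0123456789" for ch in value)

print(is_valid_rentflag("0" * 36))  # format of the normal subscriber
print(is_valid_rentflag("0"))       # format found on the abnormal subscriber
```

Running such a check over exported subscriber data would flag records like the two in this case before they cause MML errors.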

2.9 Vpn websmap run unstably due to incorrect read timeout parameter

Title: Vpn websmap run unstably due to incorrect read timeout parameter

ID: SE0000329129

Update

time:

2008-06-25 16:30:34

Author: qinbin

Product

Family:

IN Product: WIN

Fault Type: SMAP

Keywords: MC read timeout

Digest:

Phenomenon

Description:

1. Login to websmap often fails;

2. When operating some menus, it often prompts an MML error;

3. When operating some menus, it often logs out automatically.




Alarm

Information:

Null

Cause

Analysis:

We checked the log file servicelog on the websmap server and found the following errors:

[ERROR ] Create SmpSession Fail,maybe network or user password problem (SmpSession.java:236)"

[WARNING] communication fail:java.net.SocketTimeoutException: Read timed out 0(SMPLog.java:150)

For the first error, we had already confirmed that the network and password were correct, because network performance was good for the other equipment in the same local network.

For the second error, we checked the MC configuration terminal and found that the read timeout was 10. According to the documentation, the value should be set to 0 (the default value, which means never expire).

Handling

Process:

Log in to the MC screen, change the read timeout from 10 to 0, and save. After this, the VPN websmap works normally.

Suggestions

and

summary:

We should not modify the default value when we don't know the detailed meaning.

2.10 Database operation delay leads to tc_begin missing

Title: Database operation delay leads to tc_begin missing

ID: SE0000335257

Update

time:

2008-06-25 16:28:24

Author: Lizhe

Product

Family:

IN Product: WIN

Fault Type: SCP

Keywords: tc_begin miss

Digest:

Phenomenon

Description:

The SCP reports call abandon. In manager*run:

[03:42:30][DS1] Prompt: TC_BEGIN: (receive - send) = (686879292 - 686821861) = 57431





57843

Alarm

Information:

Some call abandons happened at checkpoint time, but others happened at random times.

Cause

Analysis:

The usual reasons for a missing tc_begin include:

1. The dialog raised by USAU is negative; scf will not receive the tc_begin message.

2. An ARI or GPRS message buffers tc_begin first and then sends it to scf.

3. FIFO overload or dynamic overload drops the tc_begin message; scf will not receive the message.

4. In load-sharing mode, an ARI message will not send tc_begin to scf if the module number is wrong.

5. CAPS is too high, which leads to static overload; scf will not receive the message.

Reasons 1 and 4 record error messages in the logs.

For reason 2, the value of "TC_BEGIN: (receive - send)=" increases first and then decreases.

For reason 5, CAPS was not very high when tc_begin went missing.

So the most likely reason is dynamic overload.

Checking scf*run shows delays in the stored procedure CashRecharge.
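
To follow how the "TC_BEGIN: (receive - send)" value develops over time, the prompt lines in manager*run can be parsed. The Python sketch below is an illustration only: the pattern is modelled on the single line quoted in the Phenomenon Description, and the function name is hypothetical.

```python
import re

# Pattern modelled on the TC_BEGIN prompt line quoted in this case.
TC_BEGIN = re.compile(r"TC_BEGIN: \(receive - send\) = \((\d+) - (\d+)\)")

def tc_begin_gap(line):
    """Return (receive, send, receive - send) for a TC_BEGIN prompt line, else None."""
    match = TC_BEGIN.search(line)
    if not match:
        return None
    receive, send = int(match.group(1)), int(match.group(2))
    # Recompute the gap instead of trusting the printed value.
    return receive, send, receive - send

line = "[03:42:30][DS1] Prompt: TC_BEGIN: (receive - send) = (686879292 - 686821861) = 57431"
print(tc_begin_gap(line))
```

Collecting the gap from successive lines shows whether it rises and then falls (reason 2) or keeps growing, which points to dropped messages.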

Handling

Process:

Optimize the onconfig parameters in Informix: expand LOCKS and reduce BUFFERS, LRU_MAX_DIRTY and LRU_MIN_DIRTY.

Optimize the CashRecharge procedure and modify the primary key of the table.

After testing again, tc_begin messages are now rarely missing.

Suggestions

and

summary:

None


Chapter 3 IPCC Cases

3.1 Openeye can not login with a certain number

Title: Openeye can not login with a certain number

ID: SE0000396020

Update time: 2009-06-30 20:18:21

Author: TaiZhi

Product Family:

Operation Support System Product: Call Center and CRM

Fault Type: Others

Keywords: openeye uap phone number lock

Digest:

Phenomenon Description:

One openeye can not login to UAP, but the others are all OK.

Alarm Information:

Null.

Cause Analysis:

1, Network problem 2, Openeye application error 3, Others

Handling Process:

1. Ping the IP of the IFM board (for SIP): no packets are lost, so the network is OK.

2. Log in with another phone number: it succeeds, so there is no error in the Openeye application.

3. We found that the agent had closed Openeye abnormally last time, after which the UAP locks that number for about 50 to 60 minutes. Running 'URG EDPT' on the UAP unlocks the number. The problem is fixed.

Suggestions and summary:

Null.

3.2 OAS program auto shutdown

Title: OAS program auto shutdown

ID: SE0000396336

Update time: 2009-06-30 19:47:22

Author: Tang Min

Product Family:

Operation Support System Product: Call Center and CRM

Fault Type: Call Center and CRM

Keywords: OAS auto shutdown

Digest:


Version: OAMV2.0D313 ICD Product V300R004C01B112

Alarm Information:

OAS auto shutdown.

Cause Analysis:

1. When OAS shuts down abnormally, it creates a core dump under the folder corebakup. Please provide it to HQ for analysis.

2. Run the following commands and provide the result to HQ:

</home/zxy/oas/bin>gdb oas /home/icd/oasdir/corebakup/core_20090603_135212_oas

(gdb) where

3. In the environment profile .cshrc, some variables are configured for OAS. Check whether it contains:

setenv LD_LIBRARY_PATH /opt/jdk/javaoamlib/lib:*****

4. With this setting, OAS uses the JDK's library instead of its own library in oasdir/lib.

5. OAS and WAS are both installed on the CTI server. WAS needs the JDK, which is why LD_LIBRARY_PATH was set this way:

setenv LD_LIBRARY_PATH /opt/jdk/javaoamlib/lib:**

Handling Process:

Delete /opt/jdk/javaoamlib from .cshrc,

run the command: source .cshrc

and restart OAS.


Suggestions and summary:

None.

3.3 Database objects turn invalid lead Agent Software can't login, the message like this, connecting the platform fail

Title: Database objects turn invalid lead Agent Software can't login,the message like this,connecting the platform fail

ID: SE0000386456

Update time: 2009-06-30 19:31:20

Author: y00127258

Product Family:


Fault Type: ICD Service

Keywords: connecting the platform failed

Digest:

Agent login in the Agent software failed, and the software popped up this message: connecting the platform failed, check the configuration of CCS.

Alarm Information:

Null

Cause Analysis:

As the message indicates, there should be a fault in the ICD platform or the CTI. There are four possible causes:

1. The platform was modified by somebody.

2. The platform is controlled by the Resin software of the Web Server, so Resin may have a problem.

3. The Agent software connects to the database through the Application Server, which is controlled by iUAS, so iUAS may have a problem.

4. If the CTIserver has a problem, the agent cannot connect to CCS, so the CTIserver may have a problem.

Handling Process:

1. Check the CTIserver: the status of MDS, CTI-Link, MCP and CTIServer is normal, and MCP shows successful Agent connections. So CCS is excluded as the cause.

2. Check the status of Resin: we found database fault information such as: ora-24372, invalid objects for describe.

3. Set the Resin log level to the highest. The log then showed a lot of database fault information.

4. Log in to the database as icd/icd (platform database) and icdmain/icd (service database) with the PL/SQL software and press the button "compile invalid objects". We found many invalid objects in both the platform and service databases.

5. Compile the invalid database objects in the platform database and the service database, then press "compile invalid objects" again: there are no invalid objects left.

6. Log in to the Agent software again: it works normally.

7. Reviewing the operations before the fault: I had changed one field of a table from NOT NULL to nullable. This table is related to many database objects, so the database generated many invalid objects.


Suggestions and summary:

After modifying database objects (tables, indexes, procedures, functions, triggers, jobs, packages and so on), compile the database objects and check whether any of them is invalid. Do not modify the database while the service is running.

3.4 The call is released when the UAP try play the first file

Title: The call is released when the UAP try play the first file

ID: SE0000389678

Update time:

2009-06-30 19:31:07

Author: Christian Chavez Franco

Product Family:


Fault Type: Flow

Keywords: CTI Distributed Resources

Digest:

Phenomenon Description:

At an overseas VMS site, the call is routed to the IVR, but when the CTI orders the UAP to play the first file, an error happens and the call is released. The management and execution sites are installed in the same place. The software versions are:

UAP8100 V300R003C01B053SPH002

INtess ICD V300R004C01B112SPC002

TopEng ISE uIVR V300R001C10B033CP0001

Alarm Information:

Null

Cause Analysis:

First we confirmed that the network between the MSU boards and the file server was OK. After that we collected traces on the CTI platform and sent them to R&D for analysis.

Handling Process:

R&D confirmed that a parameter in the CTI configuration needed to be modified. The parameter "Support Distributed Resources" was set to Yes; we changed it to No, and the problem was solved.


Suggestions and summary:

Check that the CTI configuration matches the actual deployment.

3.5 IPCC QC failed to get the voice record history through the management client

Title: IPCC QC failed to get the voice record history through the management client

ID: SE0000383125

Information Type :

Troubleshooting Cases

Update Time: 2009-04-20 16:10:29

Views: 5

Author: Ouyang Fengsheng

Product Family:


Fault Type: CTI platform

Keywords: QC search&playback


Phenomenon Description:

QC logs in to the Management client to search and play back voice records, but the search returns no results. IPCC platform: ICDV300R004C01B112. DB: Oracle 9208, Windows 2003 Enterprise.

Alarm Information:

2009/02/04 10:11:47 : tvAgentChange Before query NULL
2009/02/04 10:11:47 : ICD30_DoQuery: SELECT COUNT(*) FROM vwRecordInfo2 WHERE AgentID=103 AND BeginTime >=? AND BeginTime <= ?
2009/02/04 10:11:50 : tvAgentChange Before query NULL
2009/02/04 10:11:50 : ICD30_DoQuery: SELECT COUNT(*) FROM vwRecordInfo2 WHERE AgentID=111 AND BeginTime >=? AND BeginTime <= ?
2009/02/04 10:11:52 : tvAgentChange Before query NULL
2009/02/04 10:11:52 : ICD30_DoQuery: SELECT COUNT(*) FROM vwRecordInfo2 WHERE AgentID=112 AND BeginTime >=? AND BeginTime <= ?
2009/02/04 10:11:53 : tvAgentChange Before query NULL
2009/02/04 10:11:53 : ICD30_DoQuery: SELECT COUNT(*) FROM vwRecordInfo2 WHERE AgentID=114 AND BeginTime >=? AND BeginTime <= ?
2009/02/04 10:11:55 : tvAgentChange Before query NULL
2009/02/04 10:11:55 : ICD30_DoQuery: SELECT COUNT(*) FROM vwRecordInfo2 WHERE AgentID=116 AND BeginTime >=? AND BeginTime <= ?
2009/02/04 10:13:25 : tvAgentChange Before query NULL
2009/02/04 10:13:25 : ICD30_DoQuery: SELECT COUNT(*) FROM vwRecordInfo2 WHERE AgentID=116 AND BeginTime >=? AND BeginTime <= ?
2009/02/04 10:13:26 : tvAgentChange Before query NULL
2009/02/04 10:13:26 : ICD30_DoQuery: SELECT COUNT(*) FROM vwRecordInfo2 WHERE AgentID=116 AND BeginTime >=? AND BeginTime <= ?
2009/02/04 10:13:28 : tvAgentChange Before query NULL
2009/02/04 10:13:28 : ICD30_DoQuery: SELECT COUNT(*) FROM vwRecordInfo2 WHERE AgentID=116 AND BeginTime >=? AND BeginTime <= ?

Cause Analysis:

1. Trace the aplogic VDN data source.

2. Log in to ipccdb as user icd1 to check the view vwrecordinfo2.

3. Check the field CALLCENTERID in the table trecordinfo2.

4. Check the field SUBCCNO in the table tuserinfo.

5. CALLCENTERID and SUBCCNO should have the same value for user ICD1.

Handling Process:

1. Trace at the APlogic to confirm that the search command has already been sent to the Service DB.

2. Manually run the SQL command to query the information from the database view vwrecordinfo.

3. Confirm that user ICD1 has the database right to query the view.


Suggestions and summary:

Learn the steps for analysing the message/command flow after an agent performs an operation through the Management client.

Attachments:

QC-can not searchand play back the

3.6 The possible cause when ICDCOMM platform plays sometimes voice successfully, sometimes not. A specific IVR problem

Title: The possible cause when ICDCOMM platform plays sometimes voice successfully, sometimes not. A specific IVR problem

ID: SE0000258397

Information Type :


Update Time: 2009-03-18 11:16:59

Views: 11

Author: Gabriel André D. Dubois Brito

Product Family:


Fault Type:

Keywords: ICDCOMM, CCS, IVR


This problem occurred in Telemar VAS project in Brazil in IM (Interactive Media) flow when CCS tried to send information to IVR (ICDCOMM platform). The fact was that IM service sometimes was OK (we could listen to voice) but sometimes it was not OK (we couldn't listen to voice).

Alarm Information:

There were 2 alarms. 1) I2000 alarmed that one IVR was not ok and 2) The customer alarmed us that sometimes the flow was OK (they could listen to voice) but sometimes it was not OK (we couldn't listen to voice).

Cause Analysis:

The IM service uses 2 IVRs. In the CCS configuration, all flows were set to use these 2 IVRs. One IVR had an internal process problem and could not be reopened by OMD. Thus, the IM service was sending flow requests to 2 IVRs, but only 1 IVR was actually answering. That is why it sometimes worked (when the flow request was sent to the good IVR) and sometimes did not (when the flow request was sent to the bad IVR). After we restarted the faulty IVR, the service was OK.
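
The intermittent behaviour can be reproduced with a toy simulation in Python. It assumes that flow requests are handed to the two IVRs in simple rotation, which the case does not confirm; the IVR names and the function are purely illustrative.

```python
from itertools import cycle

def dispatch(requests, ivrs, healthy):
    """Send each request to the next IVR in turn and record whether it answered."""
    targets = cycle(ivrs)
    return [(req, ivr, ivr in healthy) for req, ivr in zip(requests, targets)]

# Two configured IVRs, only one of which is actually answering.
results = dispatch(range(1, 7), ["IVR-1", "IVR-2"], healthy={"IVR-1"})
for req, ivr, answered in results:
    print(req, ivr, "voice played" if answered else "no voice")
```

With one of two targets down, every second request gets no voice, which matches the "sometimes OK, sometimes not" symptom reported by the customer.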

Handling Process:

We closed the IVR process manually and OMD could restart IVR application normally. After that, all flow requests were treated successfully by Huawei platform.


Suggestions and summary:

Null.




3.7 IPCC can't receive part of calls with incorrect conversion of accesscode

Title: IPCC can't receive part of calls with incorrect conversion of accesscode

ID: SE0000363616

Information Type :


Update Time: 2008-12-30 14:25:54

Views: 13

Author: Liu Xianyong (Eric)

Product Family:


Fault Type: Others

Keywords: IPCC Core Network AccessCode conversion


At site V in country N, I found that a local mobile phone got no response when calling *7777#, although it is supposed to be answered with a short message.

Alarm Information:

Null.

Cause Analysis:

1. Checking the service log in the database shows that calls are coming into the platform. But tracing on the IVR by caller number shows that this call is not coming in. That means some calls can come in and some cannot.

2. Tracing the incoming call from the BAM by SIP message shows that the incoming access code is A7777*, but it is supposed to be 7777.

3. The core network side had upgraded the MSC some days before, which could have caused access problems.

Handling Process:

The core network engineers analyzed the configuration of the new MSC and compared its parameters with the old one. They found that a parameter for changing the number from *7777# to 7777 was not set to Yes. After changing the configuration, the problem was resolved.


Suggestions and summary:

Remember to synchronize the configuration when replacing modules. To resolve a problem, consider all the modules and systems involved, inside and outside the platform.

3.8 The resolving process of the problem that forwarding to ivr flow failed in agent

Title: The resolving process of the problem that forwarding to ivr flow failed in agent

ID: SE0000364104

Information Type :


Update Time: 2008-12-23 11:20:07

Views: 6

Author: Jiangzhiying

Product Family:


Fault Type: Flow

Keywords: multi-knowledge flow access code transcode forwarding ivr flow


In one IPCC demo project, we used the CSP 3.0 version agent. When we forwarded the incoming call to an IVR flow, it displayed "Forwarding to ivr flow failed".

Alarm Information:

1. "Forwarding to ivr flow failed"

2.

[2008-11-26 16:49:55 500] (Login name:107) TransToIVR MediaType:5 TransType:0 AccessCode:
[2008-11-26 16:49:55 546] (Login name:107) Error: { CCCRedirectToAutoFailure }
[2008-11-26 16:49:55 625] (Login name:107) Error: { CCCOnError } ErrorCode:11
[2008-11-26 16:49:55 625] (Login name:107) TransToIVR MediaType A failure is returned when the platform function is invoked:5 TransType:0 AccessCode:;Returned error code:11
[2008-11-26 16:49:55 640] (Login name:107) GetLastErrorCode :11
[2008-11-26 16:49:55 640] (Login name:107) GetPromptByErrorCode ErrorCode:11 Value:
[2008-11-26 16:49:55 640] (Login name:107) GetPromptByErrorCode ErrDesc:Routing failed.
[2008-11-26 16:49:55 640] (Login name:107) GetLastError ErrDesc:Initiating APC card succeeded.
[2008-11-26 16:49:55 671] (Login name:107) Forwarding to the auto service 4-10091 failed

Cause Analysis:

1. The flow 10091 may have a problem.

2. The configuration of "T_scdtransfertab" may be incorrect.

3. The configuration of the CTI platform may have a problem.

Handling Process:

1. In the error log we found "Returned error code:11". This error occurs when the "forward to multi-knowledge point" function is enabled in the t_pub_sysparamter table, so we needed to disable that function.

2. Modify the t_pub_sysparamter table: the 197th item controls the "forward to multi-knowledge point" function. We changed its value to "N".

3. Log in to the agent again and test: it still displayed "Forwarding to ivr flow failed".

4. This time the error code was "1" instead of "11", which showed that our modification had taken effect but the system had another configuration problem.

5. We checked the configuration of the CTI platform again and found that the flow access code was incorrect. It was "1010", which is only the transcode. The flow access code should be "called number + transcode". The called number is "10091", so the flow access code should be "100911010".

6. We changed the flow access code to "100911010" and tested the service again. It displayed "Forwarding to ivr flow succeded!".
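
The rule from step 5, flow access code = called number + transcode, is a plain string concatenation. A minimal Python sketch, with a hypothetical function name and a digit check added only for illustration:

```python
def flow_access_code(called_number, transcode):
    """Concatenate called number and transcode, as described in step 5."""
    for part in (called_number, transcode):
        if not (part.isascii() and part.isdigit()):
            raise ValueError("called number and transcode must be digit strings")
    return called_number + transcode

print(flow_access_code("10091", "1010"))
```

Configuring only the transcode ("1010") omits the called-number prefix, which is the second misconfiguration found in this case.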


Suggestions and summary:

This case had two problems: the system had the multi-knowledge point function enabled, and the flow access code was incorrect. They should be resolved one by one according to the error log.

3.9 The status of voice circuits were fault because of the configuration of gateway type

Title: The status of voice circuits were fault because of the configuration of gateway type

ID: SE0000334411

Information Type :


Update Time: 2008-06-27 14:51:39

Views: 3

Author: Cen Jinguang

Product Family:


Fault Type: Trunk/Signal

Keywords: Gateway type , URP


All URP configuration had been finished, and the signaling was running normally. On the MGW, the status of the E1 port connected to the MSC was OK. The connection checked with the command DSP MGW: EID="100.100.100.3:2944" was normal. But the status of the voice circuits was faulty.

Alarm Information:

No warning information about the correlative hardware and configuration.

Cause Analysis:

It was likely a configuration issue, since no hardware problem was found during checking. Possible causes were a wrong relay type in the Add TDMIU command, a wrong termination ID, or other related configuration.




Handling Process:

Checked the relay type with LST TDMIU in the MGW command window: all the trunk circuits were set to EXTERN, which is correct.

Checked the termination ID set in the commands Add TDMIU (on the MGW) and ADD N7TKC (on the MGC): it started from 0 and matched between the MGW and the MGC, so this configuration was OK.

Checked the configuration of the trunk group, route and sub-route: all of these were OK.

Finally, attention turned to the connection settings between the MGW and the MGC, and the problem was found in the MGW configuration: the gateway type had been set to TRUNK Gateway in the ADD MGW command, but it should be Universal Media Gateway.

After changing the gateway type to Universal Media Gateway, the voice circuits showed as free when checked with the command DSP N7C.


Suggestions and summary:

Null.

3.10 UAP8100 can not display the original caller number of No.7 which after PRA transformed

Title: UAP8100 can not display the original caller number of No.7 which after PRA transformed

ID: SE0000333595

Information Type :


Update Time: 2008-06-26 13:45:31

Views: 15

Author: Zhao Shouyun

Product Family:


Fault Type: Others

Keywords: can not display original caller number


In one IPCC project, we use an AIP to convert PRA to No.7 and send it to the UAP. On the AIP we can see that the original caller number has already been sent out, but it is not displayed on the UAP side; only the default caller number is displayed.

Alarm Information:

Null.

Cause Analysis:

The network-detect-sign works together with the switch internal parameter 3-bit10. The original caller number can be transferred only when the network-detect-sign of the PRA signal link is 'No' and the switch internal parameter 3-bit10 equals '0'. We found that the network-detect-sign of the PRA signal link was 'Yes', so it had to be modified.

Handling Process:

Set the network-detect-sign of the PRA signal link to 'No' and the switch internal parameter to '0', using these commands:
ADD PRALNK: MN=1, SLN=96, SCN=16, NCF=FALSE; /Add link
MOD SFP: VAL=FBFF;
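The value FBFF in the MOD SFP command can be checked with a little bit arithmetic. The sketch below assumes that the parameter is a 16-bit word, that bits are numbered from 0 at the least significant end, and that every other bit was 1 before the change. None of this is stated in the case, so treat it as an illustration only.

```python
def clear_bit(value: int, bit: int) -> int:
    """Return value with the given bit forced to 0."""
    return value & ~(1 << bit)


def bit_is_set(value: int, bit: int) -> bool:
    """Report whether the given bit of value is 1."""
    return (value >> bit) & 1 == 1


# Assumption: a 16-bit parameter whose bits are all 1 before the change.
before = 0xFFFF
after = clear_bit(before, 10)

print(format(after, "04X"))   # FBFF
print(bit_is_set(after, 10))  # False
```

Under these assumptions, clearing bit 10 of FFFF gives exactly the FBFF used in the command.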


Null.

A&S Products Cases Chapter 4 CRBT Cases



Chapter 4 CRBT Cases

4.1 SMP prompt error message when it start due to SCPID is inconsistent

Title: SMP prompt error message when it start due to SCPID is inconsistent

ID: SE0000339940

Update time: 2008-07-25 11:15:43

Author: Huangxiancai

Product

Family:

IN Product: WIN

Fault Type: SMS

Keywords: SMP SCP

Digest:

Phenomenon Description:

SMP prompts the following error message when it starts.

[][2008-07-16 09:40:24] [Error][Run]fail to get intfversion from sys_netconfig to update sdpVersionInfo:this node no found.

Alarm Information:

[][2008-07-16 09:40:24] [Error][Run]fail to get intfversion from sys_netconfig to update sdpVersionInfo:this node no found.

Cause Analysis:

When SMP starts, it synchronizes the SCP version information with SCP.

First, SMP connects to SCP and then checks the SCPID against inetcfg.cfg. If the SCPID is consistent, SMP gets the version info and updates it in the sys_netconfig table.

Handling Process:

First, I executed ftp scp. The ftp login succeeded and no password was required, which meant the connection between SCP and SMP was OK.

Then I checked the value of TELLIN_SCPID in the .cshrc file on SCP. The SCPID was 101, but the NO in the inetcfg.cfg file on SMP was 102.

SCP:

TELLIN_SCPID=101

SMP:

[SCP.1]
NO = 102
HOST = 192.168.100.101
PORT = 10001
DRMHOST = 192.168.100.101
DRMPORT = 10001
DBNAME = scudb
STATUS = 1
SUBTYPE = 1
MANAGETYPE = 1
VERSION = ENIPV100R001.1DH03SP03
FTPUSER = sms
FTPPSWD = 337C6AC8C440EC0F
COMMENT = SCDU

So the cause was that the SCPID was inconsistent.

I changed TELLIN_SCPID to 102 and made the change take effect, then restarted SMP. The problem was solved.
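The consistency check done by hand in this case can be sketched as a small script. It is only a sketch: it works on text copied from the two files, it assumes the SCPID appears either as TELLIN_SCPID=101 (as quoted above) or in the csh form "setenv TELLIN_SCPID 101", and it assumes that inetcfg.cfg can be read as an INI-style file.

```python
import configparser
import re


def scpid_from_cshrc(text):
    """Extract TELLIN_SCPID from .cshrc-style text, or return None."""
    # Accepts "TELLIN_SCPID=101" as quoted in the case and the csh form
    # "setenv TELLIN_SCPID 101" (an assumption about the real file).
    match = re.search(r"TELLIN_SCPID\s*[=\s]\s*(\d+)", text)
    return int(match.group(1)) if match else None


def scp_numbers_from_inetcfg(text):
    """Map each [SCP.n] section of inetcfg.cfg text to its NO value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        section: parser.getint(section, "NO")
        for section in parser.sections()
        if section.startswith("SCP.") and parser.has_option(section, "NO")
    }


# Sample text taken from the values quoted in this case.
cshrc_text = "TELLIN_SCPID=101\n"
inetcfg_text = "[SCP.1]\nNO = 102\nHOST = 192.168.100.101\nPORT = 10001\n"

scpid = scpid_from_cshrc(cshrc_text)
mismatches = {
    section: number
    for section, number in scp_numbers_from_inetcfg(inetcfg_text).items()
    if number != scpid
}
print(mismatches)  # {'SCP.1': 102}
```

An empty result would mean that every SCP section on the SMP side agrees with the SCPID on the SCP side.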

Suggestions and summary:

Null.



ID: SE0000336739

Update time: 2007-09-27 05:41:24

Author: Qian Liang

Product

Family:

IN Product: WIN



Digest: OS

Phenomenon

Description:


uld not be in use.

Alarm

Information:


ould not be in use.

Cause

Analysis:


do it.






Handling

Process:



Suggestions

and

summary:



4.3 Gprs license was expired in every 7 days because of wrong data

Title: Gprs license was expired in every 7 days because of wrong data

ID: SE0000339885

Update time: 2008-06-26 13:45:54


Product

Family:

IN Product: WIN

Fault Type: SMS

Keywords: license expired

Digest:

Phenomenon Description:

After uploading the GPRS service license to SCP and SMP, we found that it expired every 7 days. After 7 days we uploaded it again, but it expired again.

Alarm

Information:

gprs license was expired.

Cause

Analysis:

Normally, SMPSER should update this service license every night. If SMPSER fails to update it for 7 days, the license expires.

First, we checked whether SMPSER was running every day, and it was.

Second, we checked the sync_servicenumber.log file on the SMP and found that SMPSER was unloading this data every day but not updating it.

One day data:
===========
update pps_scpconfig set Argument2='551569151321053180972777119753137525' where servicekind=1040 and brandid=0

Next day data:
===========
update pps_scpconfig set Argument2='551569151321053180972777119753137525' where servicekind=1040 and brandid=0

That means SMPSER was getting the data but not updating it.

Finally, we checked the data in the pps_licensecfg table on the SMP and found the following:

servicekind position
1011 6
1012 8
1013 11
1020 3
7200 28

There is no row for servicekind=1040 (GPRS service). This table defines which license data should be updated, so SMPSER was not updating the GPRS license.

Handling

Process:

Add the following line to the pps_licensecfg table:

1040|25

The license was updated the next day, so the problem was solved. The meaning of this data is:
======================
First field: servicekind=1040 (GPRS service license)
Second field: position=25 (Multiserviceflag[25], which is used for enabling the GPRS service).
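The check against pps_licensecfg can be sketched as follows. The rows are the (servicekind, position) pairs listed in this case. The helper name and the idea of passing in a list of expected service kinds are assumptions made for illustration, not part of the product.

```python
def missing_license_rows(license_cfg, expected_kinds):
    """Return the expected service kinds that have no row in pps_licensecfg."""
    configured = {servicekind for servicekind, _position in license_cfg}
    return sorted(set(expected_kinds) - configured)


# Rows as listed in the case: (servicekind, position).
rows = [(1011, 6), (1012, 8), (1013, 11), (1020, 3), (7200, 28)]

print(missing_license_rows(rows, [1011, 1040]))  # [1040]

# After adding the 1040|25 row, nothing is missing any more.
rows.append((1040, 25))
print(missing_license_rows(rows, [1011, 1040]))  # []
```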

Suggestions and summary:

We should also check the pps_licensecfg table to confirm that the data for the service exists.

4.4 IN can't send delete command to RBT because no portal id was configured for msisdn

Title: IN can't send delete command to RBT because no portal id was configured for msisdn

ID: SE0000337729

Update Time: 2008-07-22 15:13:36


Product

Family:

IN Product: WIN

Fault Type: Service

Keywords: Failed to Request Delete RBT for msisdn:'1559000001'! code:-1

Digest:

Phenomenon Description:

When the next monthly fee of an RBT subscriber cannot be deducted, eventagent should send a command to the RBT portal to disable the service. But IN was not sending that command to the portal.

Alarm Information:

Failed to Request Delete RBT for msisdn:'1559000001'! Code:-1.

Cause Analysis:

We turned on the debug flag for the scpser eventagent. In evengagent_debug.log, we found the following error:

106 (2008-07-03 08:31:02.709)Warn: [CRBTPortal] Cannot Find PortalID for msisdn:'1559000001'. result:-1
107 (2008-07-03 08:31:02.709)Debug: [SOCKET] Enter Close().
108 (2008-07-03 08:31:02.710)Debug: [SOCKET] Exit Close().
109 (2008-07-03 08:31:02.710)Warn: [CEventAgent]:ProcessEventType4() Failed to Request Delete RBT for msisdn:'1559000001'! code:-1

That means the system did not get the portal ID for that subscriber. The number segment and portal ID are configured in the msisdntoaip table. Querying this table returned the following data:






portalid : 1

But our number is 1559000001, so it is out of range.

Handling

Process:

Modify data of table msisdntoaip as following:



portalid : 1

After that we tested again and the problem was solved.
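The segment lookup that failed here can be illustrated with a short sketch. The actual rows of msisdntoaip are not shown in this case, so the segment boundaries below are hypothetical, as is the assumption that a segment is matched as an inclusive numeric range.

```python
def find_portal_id(msisdn, segments):
    """Return the portal ID of the first segment containing msisdn, else None.

    segments is a list of (start, end, portal_id) with inclusive bounds.
    """
    number = int(msisdn)
    for start, end, portal_id in segments:
        if start <= number <= end:
            return portal_id
    return None


# Hypothetical segments: before the fix the number falls outside the range.
before = [(1558000000, 1558999999, 1)]
after = [(1558000000, 1559999999, 1)]

print(find_portal_id("1559000001", before))  # None
print(find_portal_id("1559000001", after))   # 1
```

A lookup that returns nothing corresponds to the "Cannot Find PortalID" warning in the log.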

Suggestions

and

Summary:

Msisdn segment should be configured.



ID: SE0000334094

Update time: 2006-07-29 11:37:44

Author: Gudo

Product

Family:

IN Product: WIN

Fault Type: SCP

Keywords: RCOMM

Digest:

Phenomenon

Description:


Alarm

Information:





Cause

Analysis:


and found a backup file of scusys.cfg from yesterday.

Using "diff scusys.cfg scusys.cfg.bak", we found that three Rcomm configurations had been added to the new file:

"RCOMM_FE_ID[14] = 1011










NODE








NODE








NODE"


Handling

Process:




4. Check manager_0.log: no refresh log can be found after scusys.cfg was changed.

Suggestions

and

summary:

When changing configuration files, especially system files, refresh the memory and check manager_0.log to confirm that the change has taken effect.



ID: SE0000334110

Update time: 2008-06-25 16:33:24

Author: Guo Pei




Product

Family:

IN Product: WIN

Fault Type: SMP


Digest:

Phenomenon

Description:


rror".

Alarm

Information:


Starting MML ...


ame is empty";



Cause

Analysis:



Handling

Process:

1. Use the "p" command to check the sms_common_serv process on SMP; we found that the process does not exist.



*********************************************************

Main Menu



0 exit

*********************************************************

according to on-site information to finish the config for commserv and nodeidtab separately.




>start.sh mml



5. After finishing step 4, use "p" to check whether the sms_common_serv process exists. If it exists, "login root TELLIN internal" can be used on the mmltool interface.

Suggestions and summary:

Null



ID: SE0000334111

Update time: 2008-06-25 16:33:19

Author: Guo Pei

Product

Family:

IN Product: WIN

Fault Type: OAM


Digest:

Phenomenon

Description:




Alarm

Information:

None

Cause

Analysis:




is the default username and password for adding an OAMserver node to I2000. If it fails like this, the reason may be that somebody changed the password of super, or that the versions of OAMserver and I2000 do not match exactly.

We were sure that everything above was fine except the version match. We found that the version of OAMserver was ENIPV100R001.1Dh and the version of I2000 was Df. With the Df series of I2000, the password entered for the super user is encrypted and then sent to OAMserver, but the Dh series OAMserver does not support decryption, so it reports "username and password incorrect".

Handling

Process:


telnet 10.76.100.155 23203




3.test it.






I2000 client.

Suggestions

and

summary:

None

4.8 MML command fail due to incorrect rentflag format

Title: MML command fail due to incorrect rentflag format

ID: SE0000334911

Update time: 2008-06-25 16:30:34

Author: qinbin

Product

Family:

IN Product: WIN

Fault Type: SCP

Keywords: rentflag mml

Digest:

Phenomenon

Description:

The billing side complained that they encountered this problem when sending MML commands:

03/07 00:03:06 remain size: 0


BEG 0000MODI PPS SUBCOS:MSISDN=1699889363,SUBCOS=24 B2A5ACDC


03/07 00:13:15 remain size: 0


BEG 0000MODI PPS SUBCOS:MSISDN=1698999078,SUBCOS=24 B8A5AEDC


Alarm

Information:

NULL

Cause

Analysis:

We queried the rent service of the two subscribers via websmap, which also reported a rent service status error, while the rent service of other subscribers could be queried correctly.

When websmap queries the subcosid, or an MML command modifies it, the system checks which rent services the subscriber has registered. This corresponds to basetab_pps.rentflag. Comparing the rentflag field of normal and abnormal subscribers, we found that the rentflag of a normal subscriber looks like "000000000000000000000000000000000000", whereas that of an abnormal subscriber is "0", only one digit.

This should be the root cause.

Handling

Process:

Execute the SQL statement below:

update basetab_pps set rentflag="000000000000000000000000000000000000" where msisdn="1698999078"

Then try websmap and the MML command again; both work.

Suggestions

and

summary:

Normally, when a new subscriber is created, the rentflag in basetab_pps should have 36 digits. We must not modify it.
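The 36-digit rule can be turned into a simple check over exported rows. This is a sketch under assumptions: the rows are given as (msisdn, rentflag) pairs, a valid rentflag is taken to be exactly 36 ASCII digits, and the first MSISDN below is made up for the example.

```python
import re

# Assumption: a valid rentflag is exactly 36 ASCII digits.
RENTFLAG_PATTERN = re.compile(r"[0-9]{36}")


def is_valid_rentflag(rentflag: str) -> bool:
    """Return True if rentflag consists of exactly 36 digits."""
    return RENTFLAG_PATTERN.fullmatch(rentflag) is not None


def find_bad_rentflags(rows):
    """Return the MSISDNs whose rentflag does not have the expected format."""
    return [msisdn for msisdn, rentflag in rows if not is_valid_rentflag(rentflag)]


rows = [
    ("1690000001", "0" * 36),  # hypothetical normal subscriber
    ("1698999078", "0"),       # abnormal subscriber from this case
]
print(find_bad_rentflags(rows))  # ['1698999078']
```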

4.9 Vpn websmap run unstably due to incorrect read timeout parameter

Title: Vpn websmap run unstably due to incorrect read timeout parameter

ID: SE0000329129

Update time: 2008-06-25 16:30:34

Author: qinbin

Product

Family:

IN Product: WIN

Fault Type: SMAP

Keywords: MC read timeout

Digest:

Phenomenon

Description:

1. Login to websmap often fails;
2. When operating some menus, it often prompts an MML error;
3. When operating some menus, it often logs out automatically.

Alarm

Information:

Null

Cause

Analysis:

We checked the servicelog file on the websmap server and found the following errors:

[ERROR ] Create SmpSession Fail,maybe network or user password problem (SmpSession.java:236)
[WARNING] communication fail:java.net.SocketTimeoutException: Read timed out 0(SMPLog.java:150)

For the first error, we had already confirmed that the network and the password were correct, because network performance was good for the other equipment in the same local network.

For the second error, we checked the MC configuration terminal and found that the read timeout was 10. According to the documentation, the value should be set to 0 (the default value, which means never expire).

Handling

Process:

Log in to the MC screen, change the read timeout value from 10 to 0, then save. After this, the VPN websmap works normally.

Suggestions

and

summary:

We should not modify the default value when we don't know the detailed meaning.

4.10 Database operation delay leads to tc_begin missing

Title: Database operation delay leads to tc_begin missing

ID: SE0000335257

Update time: 2008-06-25 16:28:24

Author: Lizhe

Product

Family:

IN Product: WIN

Fault Type: SCP

Keywords: tc_begin miss

Digest:

Phenomenon

Description:

SCP will report call abandon, in manager*run:


57431


57843

Alarm

Information:

Some call abandons happened at checkpoint time, but others happened at random times.

Cause

Analysis:

The usual reasons for a missing tc_begin include:

1. The dialog raised by USAU is negative, so scf does not receive the tc_begin message.
2. ARI or GPRS messages buffer tc_begin first and then send it to scf.
3. FIFO overload or dynamic overload drops the tc_begin message, so scf does not receive it.
4. In load sharing mode, an ARI message does not send tc_begin to scf if the module number is wrong.
5. CAPS is too high and leads to static overload, so scf does not receive the message.

Reasons 1 and 4 record error messages in the logs.
For reason 2, the value of "TC_BEGIN: (receive - send)=" increases first and then decreases.
For reason 5, CAPS was not very high when tc_begin went missing.
So the most likely reason is dynamic overload.
Checking scf*run showed delays in the stored procedure CashRecharge.

Handling

Process:

Optimize the onconfig parameters in Informix: expand LOCKS, reduce BUFFERS, LRU_MAX_DIRTY & LRU_MIN_DIRTY.

Optimize the CashRecharge procedure and modify the primary key of the table.

After testing again, tc_begin messages are now rarely missed.

Suggestions

and

summary:

None

4.11 Sometimes hear huge noise when call a CRBT user

Title: Sometimes hear huge noise when call a CRBT user

ID: SE0000397765

Information Type :


Update Time: 2009-06-30 20:10:42

Views: 2

Author: Do Tuan Anh

Product Family:

Operation Support System Product: IVAS

Fault Type: Others

Keywords: VRB noise


When making a call to a CRBT user, the caller sometimes cannot hear the tone clearly and hears mostly noise. The trace messages in this case are normal.

Alarm Information:

Null.

Cause Analysis:

The noise can have several causes:
The song file is not good.
The hardware has failed.
The trunk cables are installed near the power cable of the cabinet (on both the MSC side and the CRBT side).
The VRB board, ERI board or E32 board is faulty.

Handling Process:

1. Check the song file. If it is not good, replace it. If it is good, move to the next step.
2. Check the hardware and alarms on the MGC and MGW. If they are normal, move to the next step.
3. Make a call and trace it on the MGC. Compare the messages in the noisy case with those in the normal case. If they are the same, the root cause is a faulty VRB, ERI or E32 board.
4. Block all workstations, then open them one by one and test. This shows which workstations play noise.
5. If the workstations belong to 1 or 2 VRB modules, replace the VRB board or ERI board with a new one and test again. Normally, this solves the problem.
6. If the workstations belong to 4 VRB modules that sit side by side and connect to one E32 in the MGW, replace the E32 board and test again.


In summary, this kind of problem is normally due to a faulty VRB, ERI or E32 board. Replacing the board solves the problem.

4.12 Samba user have not enough permission cause EAS submit configuration failed

Title: Samba user have not enough permission cause EAS submit configuration failed

ID: SE0000399837

Information Type :


Update Time: 2009-07-01 09:50:00

Views: 0

Author: Ai Yibo/94209

Product Family:


Fault Type: Service

Keywords: EAS access submit samba permission


At a new CRBT site deployment in country S, the on-site engineer installed the EAS of the portal and the EAS of usdp on the same desktop computer. usdp and webportal are installed on the same server (but under different users). The usdp EAS submits configuration successfully, but the webportal EAS always fails to submit configuration.
CRBT: CRBTV600R001C01B023SPC001
CRBT EAS: EAS V100R001C01B063CP0001
USDP: USDP V100R002C01B044SPC004-doc
USDP EAS: EAS V100R001C01B072
OS: suse 9 sp03

Alarm Information:

When the "save to " button is clicked, the following error message appears: "save configuration to CRBT1 error ,please check out the file access to portal".

Cause Analysis:

We added the usdp account and the webportal account to /etc/samba/smbpasswd with the command "pdbedit -a username". However, only one account (e.g. usdp) can be used to log in from a computer when mapping the shared disk, and the usdp user has no permission to access the webportal folder. So the webportal EAS fails to submit configuration.

Handling Process:

We added the root account to /etc/samba/smbpasswd with the command "pdbedit -a username", and used the root account to log in from the computer when mapping the shared disk for EAS. In this way, the EAS of the portal and the EAS of USDP can both be used on one computer.


When installing EAS for the portal or usdp, the Samba user should have enough permission to access the portal or usdp home folder.

4.13 CRBT subscriber information data difference makes a SS7 signal congestion alarm in URP

Title: CRBT subscriber information data difference makes a SS7 signal congestion alarm in URP

ID: SE0000397471

Information Type: Troubleshooting Cases

Update Time: 2009-06-30 20:07:25

Views: 0

Author: Chen Zuihong

Product Family:


Fault Type: Service

Keywords: IAM, signal congestion, HLR


At one overseas CRBT site, the core network team complained that their SS7 monitoring system had detected SS7 signal congestion from the CRBT office. The congestion showed as 22% of IAMs receiving an RCL reply directly, with an ACM replied for only 78%.

Alarm Information:

Null.

Cause Analysis:

(1) We opened the URP performance management software and created several tasks on signal links, call records, CPU and incoming office to try to capture the SS7 congestion information. The task results showed a successful connection rate of only 78% and a failure rate of 22%, the same as the SS7 signal congestion reported by the core network. This proved that the congestion existed.
(2) We dialed several CRBT phone numbers 50 times and traced the MTP3 messages. However, the MTP3 messages were normal in every test, which was very different from the reported congestion ratio.
(3) The problem was difficult to reproduce with the known CRBT test numbers. We suspected that the CRBT data differed between the HLR and the CRBT platform.
(4) We asked the HQ technical team and got this answer: suppose more subscribers have the CRBT setting in the HLR than exist in the CRBT platform. When bPlayDuduForExcption=0 in the clconfigtab table and a call is made to a subscriber that has the CRBT setting in the HLR but no valid data in the CRBT platform, the CRBT platform receives the IAM and releases it directly without replying with an ACM. This was the possible reason.
(5) If an ACM must be replied in the situation described in (4), bPlayDuduForExcption should be 1 and the dudu.wav file should exist in the file server folder audio/rp/use.

Handling Process:

(1) We checked bPlayDuduForExcption and it was 0, so we used the SQL below to modify it:
update clconfigtab set bPlayDuduForExcption=1;
commit;
(2) We restarted the UI one by one.
(3) We ran the tasks in the URP performance system again. One hour later, the task results showed that the signal congestion had disappeared.
(4) Together with the customer, we compared the CRBT information between the HLR and the CRBT platform and modified it to be the same. The problem was solved in the end.


At some CRBT sites, subscribers sometimes register successfully in the HLR but fail in the CRBT platform because of poor network QoS. Little by little, a small difference in CRBT subscribers builds up between the HLR and the CRBT platform. We need to consider this when troubleshooting signaling and call flows.
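The comparison between HLR and CRBT platform data comes down to a set difference. The sketch below assumes that both sides can export their CRBT subscriber numbers as plain lists. The export step itself is not covered, and the numbers are made up for the example.

```python
def missing_on_platform(hlr_numbers, platform_numbers):
    """Return numbers that have the CRBT setting in HLR but not in the platform."""
    return sorted(set(hlr_numbers) - set(platform_numbers))


def mismatch_ratio(hlr_numbers, platform_numbers):
    """Return the share of HLR CRBT subscribers that the platform does not know."""
    hlr_set = set(hlr_numbers)
    if not hlr_set:
        return 0.0
    return len(hlr_set - set(platform_numbers)) / len(hlr_set)


# Hypothetical exports from both systems.
hlr = ["8613900000001", "8613900000002", "8613900000003", "8613900000004"]
platform = ["8613900000001", "8613900000002", "8613900000003"]

print(missing_on_platform(hlr, platform))  # ['8613900000004']
print(mismatch_ratio(hlr, platform))       # 0.25
```

Calls to the numbers in the first result are the ones that would be released without an ACM when bPlayDuduForExcption=0.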

A&S Products Cases Chapter 5 SMC/SMS GW Cases



Chapter 5 SMC/SMS GW Cases

5.1 One reason for Mo error code 255 in CG sm service

Title: One reason for Mo error code 255 in CG sm service

ID: SE0000399755

Update time: 2009-06-30 20:34:15

Author: 2

Product Family:

Data Service Product: SMC

Fault Type:

Keywords: ForwardRespBeforeDeliver 255 MtSubmit

Digest:


CGSMC Version: V300R002.2D9
Carrier A is a GSM carrier and carrier B runs a CDMA network. When a message is sent from A's GSM mobile to B's CDMA mobile, A's SMC sends MtSubmit to B's SMC, and B's SMC delivers the message to the CDMA network. In the Gateway_G trace of B, there are many MtSubmit_error messages with error code 0xFF (255).

Alarm Information:

Null.

Cause Analysis:

The cause is the ForwardRespBeforeDeliver setting in cdmaconfig.ini. The meaning of ForwardRespBeforeDeliver is:
Whether this SMS system gives a response to or delivers the uplink MT after receiving it from other GSM SMS systems.
NOTE: When the value is set to 1, the delivery result received on the peer end cannot reflect the final result of the message delivery.
Details:
Value range: 0–1
0: Attempt to deliver a message, and then return a success or failure response according to the delivery result. This SMS system does not store SMs.
1: Return a response, and then attempt to deliver the SM. The SMS system returns the response to the peer end first, and then delivers the SM. In case of failure, the SMS system will deliver it again.
Default value: 0
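The difference between the two values can be sketched in a few lines. This is a simplified model, not the SMC implementation: the function name and the "success"/"failure" strings are invented for the example, the response returned for value 1 is assumed to be a success response, and the redelivery that follows a failure is left out.

```python
def handle_mt_submit(forward_resp_before_deliver, deliver):
    """Return (response_to_peer, delivered) for one incoming MtSubmit.

    deliver is a callable that attempts delivery and returns True or False.
    """
    if forward_resp_before_deliver == 0:
        # Value 0: deliver first, then answer according to the result.
        delivered = deliver()
        return ("success" if delivered else "failure"), delivered

    # Value 1: the response is fixed before the delivery attempt,
    # so it cannot reflect the final delivery result.
    response = "success"
    delivered = deliver()
    return response, delivered


def phone_switched_off():
    """Model a delivery attempt to a handset that is switched off."""
    return False


print(handle_mt_submit(0, phone_switched_off))  # ('failure', False)
print(handle_mt_submit(1, phone_switched_off))  # ('success', False)
```

With value 0 the peer sees every failed delivery, which is consistent with the many MtSubmit_error results seen in the trace.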

Handling Process:

In B's cdmaconfig.ini, ForwardRespBeforeDeliver = 0, so the SMC first tries to deliver the message and then returns a response according to the delivery result. The error code 0xFF (255) appears because the destination mobile phones are switched off.





Null.

5.2 The Messages Are Deleted Unexpectedly

Title: The Messages Are Deleted Unexpectedly

ID: SE0000396611

Update time: 2009-06-29 11:55:56

Author: w38314

Product Family:


Fault Type: Others

Keywords: delete messages

Digest:


The customer complained that messages were deleted unexpectedly. The SMSC deleted the messages without following the "Schedule Strategy".

Alarm Information:

The on-site engineer helped to export all the messages from the database with the command "exp smsc/oracle@ora92 query=\"where error_code!=0\" tables=sm_histable0517" (it is an Oracle database).

Cause Analysis:

Storing messages in memory for redelivery is a basic SMSC function, and the customer was very worried about this problem. However, the Huawei SMSC has been deployed all over the world, so it should not have such a defect.

Handling Process:

There are 4 items that control when the SMSC deletes normal messages:
1: In smscconfig.ini, there is a global validity time configuration.
2: In the “Service Attribute Management” table, you can configure the validity time for a service provider.
3: In the “Schedule Strategy”, you can configure it.
4: In the message itself, there is an optional “validate period” parameter.
After the SMSC delivers a message for the first time, it compares the timeout value in "Service Attribute Management" (3600 seconds timeout) with the schedule strategy "Default Strategy" (1800+1800+72000 seconds timeout). It finds that the message will have timed out by the time the SMSC tries to deliver it for the second time (it will reach 3600 seconds), so the SMSC deletes the message.
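The arithmetic behind this can be checked with a short sketch. It assumes that the strategy intervals are waiting times counted from the previous attempt and that a retry falling at or after the validity limit counts as timed out. The case does not spell out either rule, so the sketch only shows how the numbers line up.

```python
from itertools import accumulate


def retry_offsets(intervals):
    """Return the time of each retry in seconds after the first delivery."""
    return list(accumulate(intervals))


def first_expired_retry(intervals, validity_seconds):
    """Return the 1-based index of the first retry at or past the limit."""
    for index, offset in enumerate(retry_offsets(intervals), start=1):
        if offset >= validity_seconds:
            return index
    return None


default_strategy = [1800, 1800, 72000]  # seconds, from the case
service_validity = 3600                 # seconds, from the case

print(retry_offsets(default_strategy))                          # [1800, 3600, 75600]
print(first_expired_retry(default_strategy, service_validity))  # 2
```

Under these assumptions the second retry already reaches the 3600 second limit, so the 72000 second interval of the strategy is never used.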


None.




5.3 Configuration for Working With MMBOX With Timeout and Expiry Time

Title: CMAPServer Returned Many Error Codes of 34

ID: SE0000399136

Update Time: 2009-06-29 11:56:03

Author: a00710630

Product Family:


Fault Type: Shortmessage Center

Keywords: CMAPServer returned many error codes of 34.

Digest:


This is a CDMA NGN/WLL network. When querying the database, I always find many 'CMAP error code 34' results returned by the CMAP server. This error code (34) means 'system error'.

Alarm Information:

Many 'CMAP error code 34' results are returned by CMAPServer; error code 34 means 'system error'. The service STAT in the CMAP layer is affected.

Cause Analysis:

The CMAP server has only one error code, '0x8', which means that the end user is powered off or out of the service area. One of the functions of SMCAPP is to distinguish the exact error in these cases, so two error codes are defined in SMCAPP: '0x18' means the end user is powered off, and '0x8' means the end user is out of the service area. SMCAPP decides which error code applies from the response time of the CMAP server. If the error code sent to the CMAP server is '0x18', the CMAP server returns error code 34 to the upper level; this is decided by the program.

Handling Process:

To solve this problem, modify the configuration file 'smscconfig.ini' and change the value of the item 'MSShutDownSpan' to '1' second. After that, if the response from the wireless environment takes more than 1 second, SMCAPP sends only the error code '0x8' to the CMAP server, and the CMAP server can then send out the right error code. On the other hand, if the wireless environment is better and the response arrives within 1 second, CMAP error code 34 may still occur; in that case the final solution is to develop a patch version for the SMSAPP module.
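The effect of MSShutDownSpan can be pictured as a threshold on the response time. The sketch below is inferred from the description above and is not taken from SMCAPP: the function name is invented, and the rule that a response slower than the span is classified as '0x8' and anything else as '0x18' is an assumption.

```python
POWER_OFF = 0x18            # end user powered off (SMCAPP code from the case)
OUT_OF_SERVICE_AREA = 0x8   # end user out of service area


def classify_failure(response_seconds, ms_shutdown_span):
    """Pick the SMCAPP error code from the response time of a failed delivery."""
    # Assumption: a response slower than the span is treated as
    # "out of service area"; a faster one as "powered off".
    if response_seconds > ms_shutdown_span:
        return OUT_OF_SERVICE_AREA
    return POWER_OFF


print(hex(classify_failure(2.5, 1)))  # 0x8
print(hex(classify_failure(0.4, 1)))  # 0x18
```

Only the second result would lead to CMAP error code 34, which is why a small span reduces how often it appears.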

Suggestions and Summary:

It would be better for smsapp to have a switch control and a detailed error information function for the specific application.

5.4 Wrong config in clustermng.ini cause dbdaemon status become slave on both nodes

Title: Wrong config in clustermng.ini cause dbdaemon status become slave on both nodes

ID: SE0000399624

Update time: 2009-06-30 20:33:44

Author: g00116902

Product Family:


Fault Type: Maintenance

Keywords: clustermng slave dbdaemon

Digest:


The dbdaemon status becomes slave on both nodes. When the SMC system starts, the dbdaemon module is always slave. After a restart or a switchover to the other node, the dbdaemon status stays the same.




Alarm Information:

[2009-05-18 15:27:13] dbdaemon ShakeHand exception,so --
[2009-05-18 15:27:13] dbdaemon process pid = 9724 exception, now Restart it
[2009-05-18 15:27:21] normal kill dbdaemon pid=9724
[2009-05-18 15:27:21] ---start dbdaemon process:---
[2009-05-18 15:27:21] --start dbdaemon process OK pid: 9826--

Cause Analysis:

When someone changes the section order or deletes a section in Clustermng.ini, they should make sure each ShareMemID is still the right one. Clustermng uses this value to control the applications; if it is wrong, the corresponding module will have problems.

Handling Process:

1. Checked the Clustermng log and found errors about the dbdaemon module.
2. Checked the recent operations: the on-site engineer had modified Clustermng.ini to exclude the L2cache module.
3. Compared clustermng.ini with the original one: the ShareMemID of DBDaemon should be 134, but it was now 141. The engineer must have made a mistake when modifying the clustermng.ini file.
4. After the ShareMemID was corrected and clustermng was restarted, the system returned to normal.
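The comparison in step 3 can be automated along these lines. The real layout of clustermng.ini is not shown in this case, so the [DBDaemon] section name and the INI structure below are assumptions. Only the ShareMemID values 134 and 141 come from the text.

```python
import configparser


def share_mem_ids(ini_text):
    """Map each section that has a ShareMemID to that value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {
        section: parser.get(section, "ShareMemID")
        for section in parser.sections()
        if parser.has_option(section, "ShareMemID")
    }


def changed_share_mem_ids(original_text, current_text):
    """Return {section: (original, current)} where ShareMemID differs."""
    original = share_mem_ids(original_text)
    current = share_mem_ids(current_text)
    return {
        section: (original[section], current[section])
        for section in original
        if section in current and original[section] != current[section]
    }


# Hypothetical file fragments; only the two ID values come from the case.
original_ini = "[DBDaemon]\nShareMemID = 134\n"
current_ini = "[DBDaemon]\nShareMemID = 141\n"

print(changed_share_mem_ids(original_ini, current_ini))
# {'DBDaemon': ('134', '141')}
```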


When someone changes the section order or deletes a section in Clustermng.ini, they should make sure each ShareMemID is still the right one.

5.5 Customer Complains That Send Only One SM But Was Charged Many Times

Title: Customer Complains That Send Only One SM But Was Charged Many Times

ID: SE0000398392

Update time: 2009-06-29 11:55:34

Author: liguanying

Product Family:


Fault Type: Service

Keywords: recurrent route charged many times

Digest:


The customer informed the on-site engineer that some subscribers sent only one message but were charged many times. The on-site engineer checked the history database through the MT console and found that the message had been delivered 20 times within one minute. The MO bills for the same time segment also showed the same message submitted 20 times.

Alarm Information:

Null.

Cause Analysis:

Since the same message was submitted to the SMC many times, we can exclude a problem in the SMC itself. The message flow is: the subscriber submits the SM from the CDMA network, and it reaches the SMC through the SMSGW. Following the message flow, we found that the problem was caused by the SMSGW routes. There was an existing route for number segment "601" to the destination account that connects to the SMC. The on-site engineer then added another route for number segment "60" so that the SMC could send messages to the CDMA network (whose number segments are "6014" and "6013") through the SMGW. So when CDMA messages reached the SMGW, the destination address matched the route for "601" and the message went back to the SMC again.

Handling Process:

Remove the route for number segment "60" and add two routes for number segments "6013" and "6014". Messages to CDMA then no longer match the route for "601" and go back to the SMC. The problem was solved.
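The routing loop and its fix can be reproduced with a longest-prefix match. Whether the gateway really selects routes this way is an assumption, although it is consistent with the behaviour described. The account names to_smc and to_cdma and the sample destination number are made up for the example.

```python
def match_route(dest_address, routes):
    """Return the account of the longest prefix matching dest_address, or None."""
    best = None
    for prefix in routes:
        if dest_address.startswith(prefix):
            if best is None or len(prefix) > len(best):
                best = prefix
    return None if best is None else routes[best]


# Routes before and after the fix; account names are hypothetical.
before = {"601": "to_smc", "60": "to_cdma"}
after = {"601": "to_smc", "6013": "to_cdma", "6014": "to_cdma"}

cdma_number = "60141234567"  # hypothetical CDMA destination

print(match_route(cdma_number, before))  # to_smc  (message loops back)
print(match_route(cdma_number, after))   # to_cdma
```

Before the fix the more specific "601" route wins over "60", which sends the CDMA message back to the SMC.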


When adding a route in the SMGW, be careful and check whether it could create a recurrent route.

5.6 CDMA(SMPP) Error Code Statistics Result Doesn't Contain any Data

Title: CDMA(SMPP) Error Code Statistics Result Doesn't Contain any Data

ID: SE0000396277

Update time: 2009-06-29 11:54:24

Author: Li Guanying

Product Family:


Fault Type: Report/I2000

Keywords: CDMA SMPP Error Code Statistics DestIFType

Digest:


When the customer logs in to the SMC Report INRPT client and queries "templates->Failure Cause Detail List->CDMA(TDMA) Error Code Statistics" (or SMPP Error Code Statistics), the report table shows nothing and contains no data. All fields are blank.

Alarm Information:

Null

Cause Analysis:

After checking the stored procedure used by these two templates, I found that SMC Report uses a field named DestIFType in the SMC bills to identify the network type. But DestIFType is an optional field in MO/MT/History bills. The site did not output DestIFType in the bills, so when the bills were loaded into the statistics database, the Prestat system's default configuration set this field to "0", which refers to GSM. (1: CDMA, 2: TDMA, 3: SMPP)

Handling Process:

Modify the SMC config file billloacal.ini (under path smc/config) as below to output the DestIFType field in MO/MT/History bills. billcreater must be restarted for the change to take effect.

billloacal.ini
[MoBill]
IsWriteDestIFType = 1
[MtBill]
IsWriteDestIFType = 1
[HistoryBill]
IsWriteDestIFType = 1





Null.

5.7 SMSGW Got Error 8 Because Charging Problem

Title: SMSGW Got Error 8 Because Charging Problem

ID: SE0000396338

Update time: 2009-06-29 11:54:12

Author: chen lisheng

Product Family:


Fault Type: Service

Keywords: SMPP error 8, system error

Digest:


When tracing an SP service code on the interface between SMSC and SMSGW, we found some messages that failed to deliver with error code 8. This error code comes from the SMGW, so we needed to check why the SMGW returned it.

Alarm Information:

Null

Cause Analysis:

1: I traced the whole flow and found that when SMSGW sent the authentication request to MDSP, MDSP fed back error code "3108", which means charging failed on the MDSP side.
2: I asked the OCS engineer to check the balance of the calling subscriber. The balance was not enough for an SMS, which is why we got error 8 between SMSC and SMSGW.

Handling Process:

This is not a problem with our system. The subscriber just needs to recharge and will then be able to use the service.


Tracing the message is very helpful when solving problems on SMSC and SMSGW.

5.8 Incorrect Configuration Caused SMSC MNP Function Test Failed

Title: Incorrect Configuration Caused SMSC MNP Function Test Failed

ID: SE0000397805

Update time: 2009-06-29 11:53:27

Author: gaoyantao

Product Family:





Fault Type: Shortmessage Center

Keywords: IMSI MNP mo error

Digest:


As required, we upgraded the SMC system to support the MNP function (originating IMSI authentication). During the test we found that all numbers belonging to this operator passed, but numbers belonging to other operators in this country failed with an originator authentication error.

Alarm Information:

Null.

Cause Analysis:

From the SMC interface trace we found that when the test used a local operator number, the SMC sent a "VipSRIDeliver" request to query the IMSI from the HLR. For a number belonging to another operator there was no "VipSRIDeliver" at all: on receiving "MoSubmit", the SMC directly returned "MOSubmitError". So we suspected a configuration related to the operator number segment. Checking the "smscconfig.ini" file, we found that "[MobileHead]""GsmMobileAddrHead" had been configured only for the operator itself.

Handling Process:

Configure "[MobileHead]""GsmMobileAddrHead" in the "smscconfig.ini" file for all operators in this country and restart the smcapp process. The test then passes.


For an SMC version that supports the MNP function, "[MobileHead]""GsmMobileAddrHead" in the "smscconfig.ini" file needs to contain the number segments of all operators in this country.

5.9 How to Solve Unknown Protocol When Tracing Interface Between Map and Mtiserver

Title: How to Solve Unknown Protocol When Tracing Interface Between Map and Mtiserver

ID: SE0000398105

Update time: 2009-06-29 11:53:19

Author: Zhuyi

Product Family:



Keywords: Unknown protocol

Digest:


When tracing the interface between map and mtiserver, the output contains many "Unknown protocol" entries. See the attached picture.

Alarm Information:

NULL

Cause Analysis:

The wrong byte order was selected in the trace window.

Handling Process:

Choose "Windows Order" in the trace window. The protocol is then displayed with the correct description.





Null.

5.10 DBDAEMON is Disconnected With The SMSC

Title: DBDAEMON is Disconnected With The SMSC

ID: SE0000397898

Update time: 2009-06-29 11:53:05

Author: 00707667Muneeb Imran

Product Family:


Fault Type: Database

Keywords: Dbdaemon disconnect

Digest:


DBDAEMON runs on the same server as the SMSC. On the DBDAEMON I saw BadLinkNum=4. I checked the configuration file dbdaemonconfig.ini; the name of the SMSC and its IP address were entered correctly there, but it still showed BadLinkNum=4.

Alarm Information:

NULL

Cause Analysis:

The password of the "sa" account in SQL Server had been changed.

Handling Process:

After running Query Analyzer and setting the correct password, the issue was resolved.


Null.


Chapter 6 infoX-WISG Cases

6.1 Network cable of SUN mini computer ce1-ce4 could not display correct IP address

Title: Network cable of SUN mini computer ce1-ce4 could not display correct IP address


After connecting the network cards according to the IPMP configuration manual, we found that the network interfaces ce1-ce4 could not display the correct IP address.

Alarm Information:

Null.

Cause Analysis:

The file /etc/path_to_inst contained the network interfaces ce1-ce4 and also ce6-ce9. When we configured ce6-ce9 according to the IPMP configuration manual, the IP addresses were displayed normally. So one network interface corresponded to two interface names. This might be caused by a problem during OS installation. Initializing the file /etc/path_to_inst solves this problem.

Handling Process:

1. Enter OK mode.
2. Use the boot cdrom -s command to boot from the optical disk into single-user mode.
3. Use format to view the first disk, then press Ctrl+D (suppose the first disk is c1t0d0s0).
4. Run mount /dev/dsk/c1t0d0s0 /a
5. Run cp /etc/path_to_inst /a/etc/
6. Run reboot -- -r


Add this problem to the IPMP configuration guide or the SUN mini computer troubleshooting guide. The problem is common: it occurred on 5 of 17 devices.
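Before reinitializing the file, the double naming can be looked for in /etc/path_to_inst itself. The Python below is a minimal sketch that parses entries of the form "physical path" instance "driver" and reports any physical path bound to more than one instance of the ce driver. The sample device paths are hypothetical, and whether an affected machine shows the duplicates in exactly this form is an assumption.

```python
import re
from collections import defaultdict

# One path_to_inst entry: "physical path" instance-number "driver name"
ENTRY = re.compile(r'^"([^"]+)"\s+(\d+)\s+"([^"]+)"$')

def duplicate_instances(text, driver="ce"):
    """Return {physical_path: [instances]} for paths bound to more than one instance."""
    seen = defaultdict(list)
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match and match.group(3) == driver:
            seen[match.group(1)].append(int(match.group(2)))
    return {path: sorted(nums) for path, nums in seen.items() if len(nums) > 1}

# Hypothetical sample, not taken from the affected machine.
SAMPLE = '''
"/pci@1e,600000/network@2" 1 "ce"
"/pci@1e,600000/network@2" 6 "ce"
"/pci@1e,600000/network@3" 2 "ce"
'''

print(duplicate_instances(SAMPLE))
```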

6.2 The name of the folder of wapgw dual-system script is incorrect

Title: The name of the folder of wapgw dual-system script is incorrect


After the wapgw cluster script is decompressed, the directory is /etc/WAPGW. Adding the wapgw resource fails:
root@wapgw2 # scrgadm -a -g oracle_rg -j plat-app -t HW.WGW -y Resource_dependencies=oracle_server
VALIDATE method failed -- check syslog for error messages

Alarm Information:

Null.

Cause Analysis:

For such a problem we usually check the log, but here it contained no useful information. The failure may be caused by an incorrect cluster script, an already existing resource name, or a resource dependency.

Handling Process:

1. Confirm that the upload mode is correct.
2. Confirm that the other resource groups are correct.
3. Confirm that the following operations have been performed: run the dual-node script. You can copy the HWWGW directory to /opt as root.
# cp -R HWWGW /opt
# cd /opt/HWWGW/etc
# cp HW.WGW /usr/cluster/lib/rgm/rtreg/
# cd /opt/HWWGW
# chmod -R +x *
4. Confirm that the data related to the wapgw cluster script is the same on both devices, and then run the following command:
scrgadm -a -g oracle-rg -j plat-app -t HW.WGW -y Resource_dependencies=oracle_server
The system prompts the same failure information.
5. Based on experience, rename the wapgw cluster script directory from /etc/WAPGW to /etc/HWWGW, and then run the command again to add the resource. This solves the problem.


The name of the folder of the wapgw cluster script must be /etc/HWWGW.

6.3 New wapgw users cannot use VI to edit configuration files

Title: New wapgw users cannot use VI to edit configuration files


New wapgw users cannot use vi to edit configuration files, and ^M marks appear in the bills.

Alarm Information:

When new wapgw users use vi to edit configuration files, the message "vi permit" appears. There are ^M marks in the bills.

Cause Analysis:

The error information shows that vi is not permitted to use /var/tmp/ExQtaiME. Check the permissions and ownership that apply to the wapgw user.

Handling Process:

Check the /var/tmp/ directory. The owner of tmp is root:other and the directory is read-only. As the root user, run chown root:sys tmp and then chmod 755 tmp, and validate the .cshrc environment variables.


After the ownership and permissions of /var/tmp are corrected as described in the handling process, restart the monitor process. Now you can use vi to edit configuration files, and the bills are in the correct format.

6.4 MMS can not be sent

Title: MMS CAN NOT BE SENT


MMS subscribers cannot post MMS. Every time, their post message is rejected by the MMSC because no MSISDN is forwarded inside the post message.

Alarm Information:

The logs inside the MM1 server in MMSC show that there is no MSISDN number forwarded.

Cause Analysis:

When packets are captured between wapgw and MMSC, the MSISDN is forwarded in cookie format. When the http process was traced, this message appeared:
[03:07:02:661][331][FINEST]: Can't Find UserInfo Key=10.246.231.97:1, get UserInfo Cancel
This means that the ForwardType value in the URLInfo table is 1, whereas it should be 0.

Handling Process:

As a solution:
1. Log in to the database as user wapgw and execute these SQL commands:
SQL>update sysconfigint set configvalue=0 where ConfigField=2;
SQL>commit;
SQL>update URLInfo set ForwardType=0;
SQL>commit;
SQL>update ServerZoneTable set ForwardType=0;
SQL>commit;
2. Restart the http process.


We should be familiar with the configuration in both the config file and the database.

6.5 The Oracle_listener resource's status is fault.

Title: The Oracle_listener resource's status is fault.


After installing the Oracle DB in the cluster, the scstat command shows:
Resource: oracle_listener wapgw1 Offline Offline
Resource: oracle_listener wapgw2 Start failed Faulted

Alarm Information:

Null.

Cause Analysis:

When we try to start the listener using the command 'lsnrctl start', the following error message is shown:

…

System parameter file is /opt/oracle/product/9.2/network/admin/listener.ora
Log messages written to /opt/oracle/product/9.2/network/log/listener.log
Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=wapgw1)(PORT=1521)))

…

Handling Process:

Open the files /opt/oracle/product/9.2/network/admin/listener.ora and /opt/oracle/product/9.2/network/admin/tnsnames.ora, and change the host IP address to the shared IP on both nodes. Then check the status:
Resource: oracle_listener wapgw1 Online Online
Resource: oracle_listener wapgw2 Offline Offline
Everything is OK.


This is a common problem when installing the Oracle DB. Likewise, when we change the IP address of the host, we should update these two files for the Oracle DB.
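A quick way to confirm that both files point at the shared address is to list every HOST value that differs from it. The Python below is a minimal sketch of such a check on the text of a listener.ora or tnsnames.ora file; the function name is illustrative and 10.0.0.100 is a hypothetical stand-in for the cluster's shared IP.

```python
import re

# Matches (HOST=value) entries as they appear in Oracle Net address descriptions.
HOST = re.compile(r"\(\s*HOST\s*=\s*([^)\s]+)\s*\)", re.IGNORECASE)

def hosts_not_matching(config_text, expected_host):
    """Return the HOST values in the file text that differ from expected_host."""
    return [host for host in HOST.findall(config_text) if host != expected_host]

# Address line taken from the error message in this case.
SAMPLE = "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=wapgw1)(PORT=1521)))"

# 10.0.0.100 is a hypothetical shared IP.
print(hosts_not_matching(SAMPLE, "10.0.0.100"))
```

An empty list means every address entry in the file already uses the expected host.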

6.6 Database jobs fail to start

Title: WAPGW jobs fail to start.





WAPGW jobs fail to start.

Alarm Information:

Executed as NT AUTHORITY\SYSTEM. Failed to create an index on TBL_PPG_DETAIL_20050825 because this table does not exist in the database WAPGW. [SQLSTATE 42S02] (error 1906) Operation failed.

Cause Analysis:

After checking, the engineer found that the table did exist in the database and that its name was correct.

Handling Process:

In st_worklog, check whether TBL_PPG_DETAIL_20050825 has been processed. If it has been processed and problems occurred while loading it into the database, check the pre-statistics database setting. In this case the IP address was incorrect; correcting it solved the problem.


None

6.7 Processes fail to connect the database due to the owner error of oracle user

Title: Processes fail to connect the database due to the owner error of oracle user


The http process fails to connect to the database.

Alarm Information:

DB error

Cause Analysis:

If oracle is installed and configured correctly, the group of the oracle user may be incorrect, so that the process cannot find the database.

Handling Process:

After checking, we found that the group of the oracle user was not dba but other. Delete the oracle user:
userdel oracle
Create the group and the user again:
groupadd -g 101 dba
useradd -u 101 -d /opt/oracle -g dba -s /usr/bin/bash -m oracle
passwd oracle


None

6.8 Process Constant Restart Due to Format Error in Configuration File

Title: Process Constant Restart Due to Format Error in Configuration File


Portal and MDMC processes constantly restarted, and there was prompt information as follows:

Alarm Information:

[Wed Jul 6 19:50:50 CST 2005] Request http://localhost:8080^M/bone.jsp fail (1/3).
[Wed Jul 6 19:52:29 CST 2005] Request http://localhost:8080^M/bone.jsp fail (2/3).
[Wed Jul 6 19:54:08 CST 2005] Request http://localhost:8080^M/bone.jsp fail (3/3).
[Wed Jul 6 19:54:23 CST 2005] Web server restart fail.
[Wed Jul 6 19:56:02 CST 2005] Request http://localhost:8080^M/bone.jsp fail (1/3).




Cause Analysis:

The file httpd.conf is in the wrong format: it contains DOS line endings, which appear as ^M in the requested URL.

Handling Process:

Run the command dos2unix httpd.conf in /home/portal/portal/conf/, the directory of httpd.conf, to convert the file from DOS format to UNIX format. Then restart Portal and MDMC.


None
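Where dos2unix is not at hand, the same check and conversion can be sketched in a few lines of Python. This is only an illustration of what the conversion does (replacing CR LF with LF); the sample content is hypothetical and is not the httpd.conf from this case.

```python
def has_dos_line_endings(data: bytes) -> bool:
    """Return True if the data contains CR LF line endings (shown as ^M in vi)."""
    return b"\r\n" in data

def to_unix(data: bytes) -> bytes:
    """Convert CR LF line endings to LF."""
    return data.replace(b"\r\n", b"\n")

# Hypothetical configuration content saved in DOS format.
sample = b"Listen 8080\r\nServerName localhost\r\n"

print(has_dos_line_endings(sample))
print(to_unix(sample))
```

Reading and writing the file in binary mode keeps Python from translating line endings on its own.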

6.9 Realizing Efficient File Download in Case of Network Packet-Loss

Title: Realizing Efficient File Download in Case of Network Packet-Loss


In the test of the multimedia download service, cell rerouting occurred while files were being downloaded, and downlink packet loss could occur (this is normal, standard behavior). The core network does not resend lost packets, so resending is handled by the upper-layer WAP/WTP (in an ftp download, TCP handles the resend). If the packet loss was severe, the download was inefficient and slow.

Alarm Information:

The download success rate is low, and the speed is low.

Cause Analysis:

In the process of downloading multimedia content, cell rerouting occurred, and downlink packet loss could occur (this is normal, standard behavior). The core network does not resend lost packets, so resending is handled by the upper-layer WAP/WTP (in an ftp download, TCP handles the resend).
A WAP multimedia file is transferred through WTP group by group, and the handset must confirm every group of data. When the handset receives the data, it returns an ACK message. If reception is abnormal, it returns a NACK message to tell the network to resend the data. The time to return the NACK message is determined by timer T1. (See the attached figure.)
When the network finishes sending data, it resends the packet if there is no response from the MS within a specified time. This time is determined by timer T2.

Handling Process:

The default value of the resend timer (T2) in Huawei WAPGW is 25s. You can shorten it; the suggested value is 3-6s. To shorten the waiting time for resend, change the file wapgw_wap.ini:
#Base retransmission time interval (seconds)
BaseReTransInterval= 25
Change it to 5s.


In most cases, there is no need to change the resend time of the protocol stack. In special cases where the carrier has specific requirements, we can change the parameter to improve efficiency.
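The gain from shortening the timer can be estimated with simple arithmetic. The Python below is a rough sketch under a stated assumption: each lost group costs exactly one expiry of the resend timer, with no back-off. The case does not describe the real retransmission schedule, and the number of lost groups is hypothetical.

```python
def resend_wait_seconds(lost_groups: int, interval_s: float) -> float:
    """Total time spent waiting for resends, assuming one timer expiry per lost group."""
    return lost_groups * interval_s

# Compare the default interval (25 s) with the value used in this case (5 s)
# for a hypothetical download that loses 4 groups.
for interval in (25, 5):
    print(interval, resend_wait_seconds(4, interval))
```

Under this assumption the waiting time falls from 100 s to 20 s for the same download.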




6.10 The wapgw http module often exits abnormally and core file is generated

Title: The wapgw http module often exits abnormally and core file is generated


The wapgw_http module often exits abnormally and core file is generated.

Alarm Information:

The http process uses a comparatively large amount of memory, which causes a core file to be generated and the process to restart.

Cause Analysis:

This case is caused by high memory usage. At present, only a workaround is available to field engineers: modify the parameters in the monitor configuration file so that the HTTP process is stopped and then restarted when its memory usage reaches 700 MB. In this way, a core dump caused by high memory usage can be avoided.
>>>AppType[1] =Wap2.0
>>>ProcName[1] =wapgw_http
>>>AdminProcessStatus[1] =1
>>>CheckDieLock[1] =0
>>>CheckIpaddress[1] =172.29.7.70
>>>CheckDieLockPort[1] =9128
>>>CheckThreadCount[1] =0
>>>ThreadCountKill[1] =0
>>>ProcMemFlag[1] =1 //pay attention to
>>>ProcMemInfo[1] =700 //pay attention to
>>>ProcStartTime[1] =2
>>>ProcEndTime[1] =3
>>>ProcMaxFailCount[1] =20

Handling Process:

Modify the ProcMemInfo value (the memory used by the process) as a mitigation. When the memory used by the http process reaches this value, the system stops the process. This is only a workaround.


For the ProcMemFlag[1] parameter, 1 indicates that the system monitors only processes; 2 indicates that the system monitors processes and time.


Chapter 7 infoX-MMSC Cases

7.1 Configuration for Working With MMBOX With Timeout and Expiry Time

Title: Configuration for Working With MMBOX With Timeout and Expiry Time

ID: SE0000334084

Update Time: 2008-06-24 08:53:20

Author: g00703939

Product Family:

Data Service Product: MMSC

Fault Type: MMS SERVER

Keywords: MMSC MMBOX timeout expiry time

Digest: null


According to the mmsc guide, an MM can be sent to the MMBOX in two cases: when the message reaches a timeout or when the message expires.

Alarm Information:

null

Cause Analysis:

We configured the following parameters to get this behavior:
SysMMExpiryTime 172800 seconds (time for message expiry, 48 hours)
SrvMMExpiryStrategry forward to TGW
SysMMTimeoutTimer 7200 seconds (time for message timeout)
SrvDeliverToLegacyStrategy Forward to TGW
SrvSYSForwardToTGWStrategy Forward to SP
However, the MMSC acts only on the expiry time, so the message arrives in the MMBOX after 48 hours when it should arrive after 2 hours. You can see this behavior in the server logs in MMSC.

Handling Process:

The reason for this behavior is the MMSC configuration. In addition to the parameters mentioned before, we need to change the following:
SrvNotifyRespTimeoutStrategy from 0 to 1
SrvRetrieveTimeoutStrategy from 0 to 1
SrvConsultPersonalForTGWFlag from 0 to 1
SrvMTErrorUserAgentProcAsNoMMSTerm from 0 to 1
These parameters are not visible in the MMSC portal, so we modify them directly in the database and then restart the mmsc application.


After we made those changes, we receive the MM in MMBOX after 2 hours. The behavior in the test is as follows: if the mobile cannot retrieve the MM within 2 hours, the message goes to MMBOX and the MMSC continues trying to send the MM for the specified expiry time (48 hours). If the mobile has not retrieved the MM after 48 hours, the MM expires, but the subscriber can still see the MM in MMBOX.

7.2 PPSAgent Not Running in MMSC

Title: PPSAgent Not Running in MMSC

ID: SE0000334418

Update Time: 2008-06-24 08:53:13

Author: Md. Saifur Rahman 81401

Product Family:



Keywords: PPSAgent, mmsc

Digest: null


After finishing the MMSC installation, when we tried to connect to IN, we found that no request was sent to IN for PPS users. Checking with #p.sh then showed that the PPSAgent module was not running.

Alarm Information:

No request sent to IN.

Cause Analysis:

The PPSAgent module was not running. We then checked the database and found that there was no entry for PPSAgent in the "MODULES" and "MODULES_INNER" tables.

Handling Process:

mmsc>sqlplus mmsc/mmsc
SQL> insert into MODULES values(13,'PPSAgent', 13, '10.164.78.196', 38023, 'PPSAgent','server',0,0);
SQL> insert into modules_inner values(13,'PPSAgent', 13, '10.164.78.196', 38023, 'PPSAgent','server',0,0);
SQL> commit;
Here 10.164.78.196 is the float IP of the MMSC, and server is the name of the host.
Check the table:
SQL> select * from modules where moduleid=13;
Then restart the mms service:
#mms stop
#mms start
#mms status


Null

7.3 Incorrect Parameter Configurations for MCAS Leads to Unavailable MCAS Services

Title: Incorrect Parameter Configurations for MCAS Leads to Unavailable MCAS Services

ID: SE0000332217




Update Time: 2008-06-03 11:15:11

Author: ichenyang

Product Family:


Fault Type: Other

Keywords: [email protected];[email protected]

Digest: null


When handsets of different types send MMS to each other, the MMSC does not forward the MMS to the MCAS server. If the UAProfile of the receiving handset does not support this type of MMS, the handset can receive it but displays it abnormally.

Alarm Information:

When an MMS is sent, no messages are exchanged between MMSC and MCAS.

Cause Analysis:

From the symptoms, we can preliminarily judge that there are some incorrect configurations for MCAS in the mmsc system, because the MMSC does not forward the MMS. So we need to check the MCAS-related configuration on the mmsc side. Two parameters were incorrect:

1. SrvIfPrejudgeForContentAdapt: "0" means content adaptation is unavailable; "1" means content adaptation is available.

2. RESERVEDSETTING: "0" means content adaptation is unavailable for the subscriber; "1" means content adaptation is available for the subscriber. This flag is the sixth digit of the value. For example, 0000002 means content adaptation is unavailable for the subscriber; 0000012 means content adaptation is available for the subscriber.

Handling Process:

Modify the parameters as follows:
1. update systemparameter set paramvalue='1' where paramname = 'SrvIfPrejudgeForContentAdapt';
2. update subscriber set RESERVEDSETTING = '0000011' where phonenumber='+3801112222';
3. commit;
4. Restart the mmsc processes.


There are 5 parameters related to MCAS on the MMSC side:
1. SrvContentAdaptMode in the systemparameter table;
2. SrvIsCntAdaptForSvcFlow in the systemparameter table;
3. SrvIfPrejudgeForContentAdapt in the systemparameter table;
4. SysMM1NotifyURLPrefixInProxyAdapt in the systemparameter table;
5. phonenumber in the subscriber table.
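The per-subscriber flag can be checked programmatically. The Python below is a minimal sketch that assumes the flag is the sixth character of RESERVEDSETTING counted from the left, which is what the value '0000011' set in the handling process suggests; the function name is illustrative and the counting direction is an assumption.

```python
def content_adapt_enabled(reserved_setting: str) -> bool:
    """True when the sixth digit (counted from the left, assumed) is '1'."""
    if len(reserved_setting) < 6:
        raise ValueError("RESERVEDSETTING has fewer than six digits")
    return reserved_setting[5] == "1"

# Values that appear in this case.
for value in ("0000002", "0000012", "0000011"):
    print(value, content_adapt_enabled(value))
```

Under this reading, 0000002 has the flag cleared, while 0000012 and 0000011 have it set.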

7.4 MMSC Could Not Send Dr To Local Users After Migrating To Mdsp Charging

Title: MMSC Could Not Send Dr To Local Users After Migrating To Mdsp Charging

ID: SE0000326765

Update Time: 2008-04-30 11:39:31

Author: 84455anfaz




Product Family:



Keywords: null

Digest: null


The MMSC has been interconnected with MDSP for charging. Since the migration, the MMSC has not been sending DRs (delivery reports) to local senders. No MM1_DeliveryReport_Requests were found in the MMSC server svc logs, but CDRs for the DRs can be found in the MMSC.

Alarm Information:

null

Cause Analysis:

The MMSC sends a DR charge request to MDSP for charging, but MDSP does not support such a charge request and therefore returns result code 2100 (Service does not exist) to the MMSC. The MMSC treats this as a charging failure, so the DR is not forwarded to the local sender.

Handling Process:

The solution is to cancel the DR charge request sent to the MDSP. This can be achieved by cancelling the charge point (4101) for DR in the MMSC DB. Note: DR CDRs will not be generated if we cancel this charge request for the DR.


null

7.5 The Value of 18th Field of MMSC CDRs is Incorrect

Title: The Value of 18th Field of MMSC CDRs is Incorrect

ID: SE0000323443

Update Time: 2008-04-02 20:05:10

Author: Li Haifeng

Product Family:



Keywords: null

Digest: null


According to the CDR description, a value of 0 in the 18th field means that sending to the MMSC succeeded. But if you look into the CDRs, you will find that:
1. There is no P2P CDR with value 0 in the 18th field.
2. Only P2E CDRs have value 0 in the 18th field.

Alarm Information:

null

Cause Analysis:

The description of the 18th field of the MMS bill CDR is as follows:
18 MM Send Status:
0: Sending to MMSC succeeded (In case of one MMSC, it indicates that the MM is successfully sent to the MMSC. In case of two MMSCs, it indicates the originating MMSC has successfully sent the MM to the terminating MMSC).
1: Reception succeeded.
2: Rejected by Recipient.
3: MM successfully transferred to the LSS (Legacy Support System).
4: MM expired.
5: Routing forward MM failed (In case of two MMSCs involved, the value of STATUS CODE in the MM4_forward.RES message is error).
6: Rejected by the system (e.g. blacklist restriction, illegal message interception transfer).
7: Unknown error.
According to the CDR description, a value of 0 in the 18th field means that sending the message to the MMSC succeeded. But we found that the CDR status was abnormal:
1) There is no P2P CDR with value 0 in the 18th field.
2) Only P2E CDRs have value 0 in the 18th field.
We studied the CDRs generated in the MMSC and found that the T CDRs were all generated first, followed by the O CDRs. That means the charge policy is configured as "charge by delivery", not "charge by send", and there is no status "0" in the P2P flow if the charging policy is "charge by delivery". The system parameter SysDirectChargePolicy controls the charging policy if the MMSC does not charge through MDSP; otherwise the charging policy is configured in MDSP. The current charge policy is "charge by delivery" according to the CDR generation sequence, and the MMSC charges through MDSP. So we should check the configuration of MDSP.

Handling Process:

We checked the MMS charging policy configured in MDSP: "charge by send" for message sending, and "charge by delivery" for message receiving. That was the problem. It was solved after we changed all the charge policies to "charge by send". The O CDRs are now generated first, and the value "0" appears in the 18th field.


The CDR status is related to the charge policy, and we can infer the charge policy from the CDR generation sequence.
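When scanning many CDRs for the status value, a small lookup helps. The Python below is a minimal sketch that maps the documented codes of the 18th field to their descriptions. The comma separator and the record layout of the sample are assumptions, since the case does not show a raw CDR line.

```python
# MM Send Status codes of the 18th CDR field, as listed in the CDR description.
MM_SEND_STATUS = {
    0: "Sending to MMSC succeeded",
    1: "Reception succeeded",
    2: "Rejected by recipient",
    3: "MM successfully transferred to the LSS",
    4: "MM expired",
    5: "Routing forward MM failed",
    6: "Rejected by the system",
    7: "Unknown error",
}

def send_status(cdr_line: str, sep: str = ",") -> str:
    """Describe field 18 (counted from 1) of one CDR record."""
    fields = cdr_line.split(sep)
    code = int(fields[17])
    return MM_SEND_STATUS.get(code, "undocumented status")

# Hypothetical record: 17 placeholder fields followed by status 0.
sample = ",".join(["x"] * 17 + ["0"])
print(send_status(sample))
```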

7.6 How to Solve The Access of MMBox Portal Becomes Slowness

Title: How to Solve The Access of MMBox Portal Becomes Slowness

ID: SE0000398414

Update Time: 2009-06-29 14:33:38

Author: chaisak

Product Family:


Fault Type: Others

Keywords: mmbox keepalive

Digest: null


Hardware: ATAE
Application: infoX MMBox Multimedia Message Box V1.2D110_01
The customer found that there was no response when trying to access the MMBox portal. After the MMBox process was restarted, the portal still could not be accessed.

Alarm Information:

null

Cause Analysis:

1. We checked the connection status by using netstat -an | grep 2020. The output was as follows:
*.2020 *.* 0 0 24576 0 LISTEN
192.168.7.56.2020 192.168.7.56.52759 32768 0 32768 0 CLOSE_WAIT
192.168.7.56.2020 192.168.7.56.52874 32768 0 32768 0 CLOSE_WAIT
192.168.7.56.2020 192.168.7.56.53015 32768 0 32768 0 CLOSE_WAIT
192.168.7.56.2020 192.168.7.56.53034 32768 0 32768 0 CLOSE_WAIT
192.168.7.56.2020 192.168.7.56.53055 32768 0 32768 0 CLOSE_WAIT
192.168.7.56.2020 192.168.7.56.53337 32768 0 32768 0 CLOSE_WAIT
2. This problem happens because the number of alive http connections of MMBox has reached the maximum. New accesses to the portal wait until some connections are released, so access takes a long time. In MMBox, the default maximum number of alive http connections is 20. It is configured in “mmbox.cfg”:
<server>
<keepalive-max>20</keepalive-max>
<keepalive-timeout>120s</keepalive-timeout>

Handling Process:

When the MMSC forwards an MM to MMBox with the command mm7_submit.req, one http connection stays alive until MMBox returns mm7_submit.res. Because the traffic that the MMSC forwards to MMBox is very high at busy times, MMBox takes a long time to process the messages, and the alive http connections between MMSC and MMBox reach 20. As a result, MMBox cannot activate new http connections. Since we cannot control the traffic from the MMSC now, the available solution is to increase the allowed number of alive http connections by changing the parameter “keepalive-max”. We increased “keepalive-max” to 300 in mmbox.cfg:
<server>
<keepalive-max>300</keepalive-max>
<keepalive-timeout>120s</keepalive-timeout>
MMBox then allows 300 alive http connections, so there are always enough free connections when subscribers access the MMBox portal.
Operation Procedure:
1. Back up “mmbox.cfg”
- Log in to the MMBox server as user mmbox, switch to the “cfg” folder, and back up mmbox.cfg: cp mmbox.cfg mmbox_20080715.cfg
2. Modify “mmbox.cfg”
- Update the three parameters spare-thread-min, thread-max and keepalive-max to 150, 400 and 300.
<thread-pool>
<thread-max>400</thread-max>
<spare-thread-min>150</spare-thread-min>
</thread-pool>
<min-free-memory>1M</min-free-memory>
<server>
<keepalive-max>300</keepalive-max>
<keepalive-timeout>120s</keepalive-timeout>
<http id="" host="*">
<port>${Var["httpPort"]}</port>
</http>
<host id=''>
<document-directory>webroot</document-directory>
<web-app id='/'>
<servlet-mapping url-pattern="/servlet/*" servlet-name="invoker"/>
</web-app>
</host>
3. Restart MMBox
- Log in to the MMBox server as user mmbox and stop the MMBox process with the command: mmbox stop
- The mmbox process will be started by the monitor. Check the MMBox status with the command: mmbox status


Always verify the new parameter values before restarting the process, because the restart affects the live system.
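Counting the stuck connections makes it easy to compare against the keepalive-max limit. The Python below is a minimal sketch that counts CLOSE_WAIT entries on the local port in netstat -an output of the form shown in this case; the function name is illustrative, and other platforms may lay out their netstat columns differently.

```python
def count_close_wait(netstat_output: str, port: int = 2020) -> int:
    """Count connections on the local port that are in CLOSE_WAIT state."""
    suffix = f".{port}"
    count = 0
    for line in netstat_output.splitlines():
        cols = line.split()
        # First column is the local address, last column is the TCP state.
        if len(cols) >= 2 and cols[0].endswith(suffix) and cols[-1] == "CLOSE_WAIT":
            count += 1
    return count

# First lines of the output captured in this case.
SAMPLE = """\
*.2020 *.* 0 0 24576 0 LISTEN
192.168.7.56.2020 192.168.7.56.52759 32768 0 32768 0 CLOSE_WAIT
192.168.7.56.2020 192.168.7.56.52874 32768 0 32768 0 CLOSE_WAIT
"""

print(count_close_wait(SAMPLE))
```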

7.7 MMSC cannot connect to MDSP

Title: MMSC cannot connect to MDSP

ID: SE0000388103

Update Time: 2009-04-24 15:18:26

Author: huang hao/00136633

Product Family:



Keywords: MMSC MDSP DSMP

Digest: null


After completing all the configurations on mmsc and mdsp, we checked the interface connections between them using the command netstat -an|grep 10011. There was no output other than port 10011 in the listening state. We found error messages in the log files of mdcc. The customer wanted the problem solved as soon as possible.

Alarm Information:

null

Cause Analysis:

Check cmanager.log of the mdcc module:
[09:16:26:148, Info, 1076212512]: 3105: print info|Receive authenticate message from MMSC(ID=226001).
[09:16:26:148, Info, 1076212512]: 3105: print info|auth failed
[09:16:26:148, Error, 1076212512]: 534: Entity Auth Failed|Entity : MMSC, ID : 226001
[09:16:36:150, Error, 1076212512]: 159: sock error, failed to recv|read (FD=36) failed, errno=0.
[09:16:36:169, Error, 1076212512]: 522: Entity connection is closed.|Entity name : MMSC, ID : 226001
The log shows that authentication of the mmsc client fails. This must be a configuration problem on the mdsp or mmsc side, so check those configuration items.

Handling Process:

The command "mms list dsmp" shows all the configuration items for dsmp in mmsc:
60 mmsc01 :/home/mmsc2>mms list Dsmp
61 mmsc01 :/home/mmsc2>
--------------------------------
CSDSMPResTimeout = "5" means: DsmpAgent is timeout with Dsmp After CSDSMPResTimeout seconds..High-level
CSDsmpAuthWithEmail = "0" means: Whether encode Email Address field in DSMPAuthPrice_Req when send MOET or EOMT message..High-level
CSDsmpInteractInFOAT = "0" means: When FOAT flow, whether interactive with dsmp ..High-level
CSSupportDsmpIMSI = "0" means: Whether encode OARelationID field in DSMPAuthPrice_Req ..High-level
DsmpChgTimeOut = "15000" means: After DsmpChgTimeOut milliseconds charging server socket time out..Restart
DsmpForwardAndCCSPID = "12345" means: The Forward and CC service spid..Restart
DsmpForwardAndCCServiceID = "12345" means: The Forward and CC service serviceid..Restart
DsmpIPAddr = "192.11.200.67" means: DSMP server ip address..Restart
DsmpMMSCID = "4900" means: Msg source device ID is charging server device ID..Restart
.......
--------------------------------
found 19 matched parameters.
The item DsmpMMSCID has a default value, which must be changed to the real one, 226001, as designated in the license file. Use the command "mms set DsmpMMSCID=226001" to set it, and then restart the mms processes to refresh the configuration.


The modification of this item is not described in the MMSC commissioning guide. We need to be familiar with the general configuration of the MMSC, especially the commands "mms list ***" and "mms set ***", so that we can check the items ourselves and set the right values.
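The comparison made in this case, between the entity ID in the authentication failure and the configured DsmpMMSCID, can be sketched in Python. The regular expression follows the "Entity Auth Failed" log line quoted in the cause analysis; the function name is illustrative, and treating every mismatch as the cause of the failure is an assumption.

```python
import re

# Matches the "Entity Auth Failed" line of cmanager.log quoted in this case.
AUTH_FAILED = re.compile(r"Entity Auth Failed\|Entity : (\w+), ID : (\d+)")

def mismatched_ids(log_text: str, configured_id: int) -> list:
    """Return entity IDs from auth-failure lines that differ from the configured ID."""
    ids = {int(entity_id) for _, entity_id in AUTH_FAILED.findall(log_text)}
    return sorted(i for i in ids if i != configured_id)

LOG = "[09:16:26:148, Error, 1076212512]: 534: Entity Auth Failed|Entity : MMSC, ID : 226001"

# 4900 is the default DsmpMMSCID shown by "mms list Dsmp".
print(mismatched_ids(LOG, 4900))
```

After DsmpMMSCID is set to 226001, the same check returns an empty list.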

7.8 Wrong Configuration Cause The New Subscriber Can Not Send And Receive MMS

Title: Wrong Configuration Cause The New Subscriber Can Not Send And Receive MMS

ID: SE0000374144

Update Time: 2009-02-05 17:31:20

Author: w00126593

Product Family:



Keywords: mms number of users limited

Digest: null


Existing users send and receive MMS normally, but new subscribers cannot send or receive MM. When user A sends an MMS to new user B, the phone shows the alarm "can not get the valid data of receipt". A check of the UPWS portal shows that user B is not a registered user. Registering number B manually produces the alarm "the number of users excceeds the maximum limited by the license. "

Alarm Information:

"can not get the valid data of receipt." "the number of users excceeds the maximum limited by the license. "

Cause Analysis:

From the alarm, the license file or some license parameter must be limiting the registration of new users. We checked the license description file and found UserNumber:-1 (-1 denotes no limit). We then used the command mmsd list lic and found the parameter LicSysSubsNumberLimit = "100", which means that the maximum number of registered subscribers is 100.

Handling Process:

Use the command mmsd set LicSysSubsNumberLimit =-1 (-1 means no limit), then restart the MMS process and the UPWS and UPSM portals. New subscribers can then register automatically, and all users can send and receive MM.


During the debugging period, parameters that the documentation does not mention should also be checked.

7.9 The Problem That Some Prepaid Subscriber Can Not Send MMS

Title: The Problem That Some Prepaid Subscriber Can Not Send MMS

ID: SE0000362423

Update Time: 2008-12-08 18:04:30

Author: chenyongfang

Product Family:


Fault Type: Others

Keywords: MMSC APN charge IN

Digest: null


The customer complained that some subscribers cannot send MMS.

Alarm Information:

Null

Cause Analysis:

1. Set the mmsc server log to level 7 and trace the log in the MMSC while an affected subscriber sends an MMS. No log appears in the MMSC server.
2. Use the WISG SDM tool to trace the messages when the MMS is sent. The accounting stop request message appears immediately after the accounting start request message. No message is sent to the MMSC, and the subscriber goes offline quickly (see the attachment for details). However, the subscriber can access the internet normally.
3. Testing with more numbers shows that postpaid subscribers can send MMS and some prepaid subscribers can send, but some prepaid numbers cannot.
4. The network has no MDSP configured, and the MMSC is not connected to IN; it charges by bill only.
5. According to the analysis above, the mmsc system has no problem; the fault lies in another network unit. We contacted the GGSN engineer to trace the messages and found that the GGSN failed to send the message to the IN system.
6. Check the subscriber attributes in the HLR and compare them with a normal number. We found that the prepaid subscribers have GPRS-CSI configured. This parameter means that GPRS triggers to IN for online charging. It was configured because the customer changed the system to online charging for the GPRS service.
7. The IN engineer checked the log in the IN system and found that charging failed when the prepaid subscriber sent an MMS, because sending an MMS also generates GPRS traffic. The GPRS service and the MMS service use different APNs, and the IN system had a charge rule only for the GPRS service APN, not for the MMS service APN.

Handling Process:

Add a charge rule for the MMS APN to the charge classes mapping table in the IN system, and configure the traffic as free of charge, because the MMSC charges per item. The GPRS traffic generated by a prepaid number when sending MMS is then charged without problems, and prepaid numbers send MMS successfully. The problem is resolved.


When we cannot find the problem in our own system, we should contact the other network units related to the mmsc and resolve the problem together.

7.10 Direct Push Message Submit Fail

Title: Direct Push Message Submit Fail

ID: SE0000365261

Update Time: 2008-12-19 11:45:53

Author: s92547

Product Family:


Fault Type: Others

Keywords: Invalid Destination Address^Direct Push

Digest: null


On an overseas MMSC site, the direct push backup solution is adopted. The MMSC version is MMSC-OVSV100R002D603. The peer SMSC is from Alcatel. When the MMSC submits a push message, the SMSC returns "Invalid Destination Address".

Alarm Information:

Null

Cause Analysis:

1. Before debugging the direct push interface, we had already finished debugging the WAP push interface, so it is best to compare the direct push message with the WAP push message and look for the differences.

2. Capture packets for WAP push and direct push and compare the two captures using Ethereal. Four differences were found; for details, refer to the attachment.

3. Based on the four differences, change the parameters one by one and test the Direct Push service each time. This revealed the cause: a "+" was added before the called number in direct push, whereas no "+" was added for WAP push.

Summary:

The control parameter for the Push receiving MSISDN is HttpMsisdnFormatTypeForPush:

0 means [+] + [country code] + [msisdn]
1 means [country code] + [msisdn]
2 means [msisdn]

Handling Process:

Change the system parameter "Push receiving msisdn" to 1, which means the called number in the push message uses the format [country code] + [msisdn], then restart the MMSC.
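The three values of HttpMsisdnFormatTypeForPush can be illustrated with a short sketch. The function name, the country code, and the subscriber number below are hypothetical; only the three format rules come from this case.

```python
def format_push_msisdn(country_code: str, msisdn: str, format_type: int) -> str:
    """Build the called number for a push message.

    format_type mirrors the HttpMsisdnFormatTypeForPush values in this case:
      0 -> [+] + [country code] + [msisdn]
      1 -> [country code] + [msisdn]
      2 -> [msisdn]
    """
    if format_type == 0:
        return "+" + country_code + msisdn
    if format_type == 1:
        return country_code + msisdn
    if format_type == 2:
        return msisdn
    raise ValueError(f"unsupported format type: {format_type}")


# Hypothetical example values, not taken from the site.
for fmt in (0, 1, 2):
    print(fmt, format_push_msisdn("84", "912345678", fmt))
```

With value 0 the called number carries the leading "+" that the SMSC in this case rejected; value 1 drops it.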


Null


Chapter 8 I2000 Cases

8.1 Database Usage Is Too High - Table space=DCNMTEMPDB_TBS, Client login fails

Title: Database Usage Is Too High - Table space=DCNMTEMPDB_TBS, Client login fails

ID: SE0000394510

Update time: 2009-06-30 17:44:47

Author: Sidarth Shah

Product Family:

Service and Software Public Product: I2000

Fault Type: I2000

Keywords: I2000,High memory usage,DCNMTEMPDB_TBS

Digest:


I2000 server version: I2000 V300R001C02B251
iManager I2000 V300R001.2D503
The I2000 database is a DB2 database.
1. The I2000 client shows a high memory usage alarm.
2. Client login fails, and alarms are not displayed, acknowledged, or cleared.

Alarm Information:

Alarm 1
Name: Database Usage Is Too High
Location Information: Host=I2000DB, Database service=db2inst1, Database=IN_OMC, Table space=DCNMTEMPDB_TBS
Alarm Source: I2000
Occurrence Time(NT): 11/05/2009 09:48:03
Type: QoS
Level: Major
Clearance: Cleared(Administrator)
Acknowledgement: Acknowledged(Administrator)
Card ID:
Alarm Handling Status:
Alarm ID: 3
Additional Information: Size=2000MB, Threshold=90%, Usage=90%
Identifier:
Equipment Alarm Serial Number: 1608
Serial Number: 1272157
Clearance Time(NT): 11/05/2009 10:09:02
Acknowledgement Time(ST): 11/05/2009 10:09:10
NE Type: OMC
Category: Fault alarm
Clearance Category: ADAC
Clearance Type: manual clear
Object Instance Type:
Operation Impact Flag:

Alarm 2
When logging in to the client, the login stops at 95% and then the message "Fault loading Failed" appears. A database error also appears when any query operation is performed. The complete message shown in the client window is:
15/05/2009 16:33:28: Network transmission timed out, try again later please. Alarm station initialization failed
15/05/2009 16:33:28: Fault loading failed
(On 05/25/2009 08:51:44, Level 3 solution:)

Cause Analysis:

For Fault 1:
# su - db2inst1
db2 connect to in_omc
Memory usage is high, as seen in the database.

Step 1 Check the total pages and used pages of the affected tablespace:
db2 list tablespaces show detail
For DCNMTEMPDB_TBS:
db2 LIST TABLESPACE CONTAINERS FOR 13
Tablespace Containers for Tablespace 13
Container ID = 0
Name = /home/db2inst1/i2k_data/REPORT_TEMP_16K_01.DAT
Type = File

Step 2 Check that all processes are running normally:
# . ./svc_profile.sh
# svc_adm -cmd status

For Fault 2:
Step 3 Check the log to find the connectivity error.

Handling Process:

To increase the tablespace, follow the steps below.

A) Check the file system space:
db2inst1@I2000DB:~> df -k
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda2 10490104 4301912 6188192 42% /
tmpfs 4154616 0 4154616 0% /dev/shm
/dev/sda6 20931964 16013612 4918352 77% /home
/dev/sda5 31462264 8397068 23065196 27% /opt

B) Connect to the database:
db2inst1@I2000DB:~> db2 connect to in_omc
Database Connection Information
Database server = DB2/LINUX 8.2.3
SQL authorization ID = DB2INST1
Local database alias = IN_OMC

C) List the tablespaces:
db2 list tablespaces show detail
Tablespace ID = 13
Name = DCNMTEMPDB_TBS
Type = Database managed space
Contents = Any data
State = 0x0000
Detailed explanation: Normal
Total pages = 128000
Useable pages = 127984
Used pages = 116800
Free pages = 11184
High water mark (pages) = 117664
Page size (bytes) = 16384
Extent size (pages) = 16
Prefetch size (pages) = 8
Number of containers = 1
Minimum recovery time = 2009-05-14-02.11.13.000000
According to this output, the current tablespace size is 128000 pages * 16 KB / 1024 = 2000 MB.

D) List the containers of the tablespace:
db2 LIST TABLESPACE CONTAINERS FOR 13
Tablespace Containers for Tablespace 13
Container ID = 0
Name = /home/db2inst1/i2k_data/REPORT_TEMP_16K_01.DAT
Type = File

E) Increase the tablespace size from 2000 MB to 2400 MB:
db2 => alter tablespace DCNMTEMPDB_TBS resize (file '/home/db2inst1/i2k_data/REPORT_TEMP_16K_01.DAT' 2400 M)

Fault 2 - Login Fails
Restart the I2000 application server to renew communication.

Stopping the I2000 services
After you run the stop_svc command, all the I2000 services are stopped.
Step 1 Log in as the root user.
Step 2 Run the following commands:
# cd <I2000 installation directory>
# . ./svc_profile.sh
# stop_svc
# stop_daem
Step 3 Query the service status:
# svc_adm -cmd status
Check whether any service is still running. If services are still running, run the kill_svc command to stop them forcibly.

Starting the I2000 services
After you run the start_svc command, all the I2000 services are started.
Step 1 Log in as the root user.
Step 2 Start the I2000 services:
# cd <I2000 installation directory>
# . ./svc_profile.sh
# start_svc
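The tablespace arithmetic used for DCNMTEMPDB_TBS can be checked with a small sketch. The helper names are hypothetical; the page counts, page size, and the 2400 MB target are the values reported in this case.

```python
def pages_to_mb(pages: int, page_size_bytes: int) -> float:
    """Convert a page count to megabytes."""
    return pages * page_size_bytes / (1024 * 1024)


def mb_to_pages(size_mb: int, page_size_bytes: int) -> int:
    """Convert a size in megabytes to a page count."""
    return size_mb * 1024 * 1024 // page_size_bytes


# Values from the "list tablespaces show detail" output in this case.
PAGE_SIZE = 16384
TOTAL_PAGES = 128000
USED_PAGES = 116800

current_mb = pages_to_mb(TOTAL_PAGES, PAGE_SIZE)
usage_pct = 100 * USED_PAGES / TOTAL_PAGES
target_pages = mb_to_pages(2400, PAGE_SIZE)
usage_after_pct = 100 * USED_PAGES / target_pages

print(f"current size: {current_mb:.0f} MB, usage {usage_pct:.2f}%")
print(f"after resize: {target_pages} pages, usage {usage_after_pct:.2f}%")
```

Under these numbers the tablespace sits just above the 90% alarm threshold before the resize and drops to roughly 76% afterwards, assuming the used page count stays the same.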


Summary: For the high usage alarm, the tablespace size was increased. For the client problem, only the I2000 application server was restarted.
Suggestion: Keep monitoring I2000 CPU utilization and other parameters.

8.2 Guide to expanding DB tablespace on I2000

Title: Guide to expanding DB tablespace on I2000

ID: SE0000387830

Update time: 2009-05-18 14:36:27

Author: Mr.Wannajak Phoniyom

Product Family:


Fault Type: Customization Service

Keywords: expand DB I2000 perfdb expandb

Digest:


Phenomena: The perfdb table space is nearly full.
OS version: SunOS5.8
I2000 version: I2000V300R001.2Dh04

Alarm Information:

Alarm Detail: A high database usage alarm was generated in I2000. Checking System Monitor Browser > Database monitor, we found that perfdb is nearly full.

Cause Analysis:

The DB table space is nearly full, so I2000 raises the alarm.




Handling Process:

1. Log in to the I2000 server remotely as root.
2. cd /opt/huawei/I2000/etc/conf/
3. Check whether the sacsvc.xml file exists in this directory.
4. If it does not, create the sacsvc.xml file, add the content below, and save it:
<database name="dbgroup">
<param name="dbTag">default</param>
<param name="dbTag">ifms_common</param>
<param name="dbTag">sys_master</param>
<param name="dbTag">sys_tempdb</param>
<param name="dbTag">perfdb_default</param>
<param name="dbTag">viewdb_default</param>
<param name="dbTag">dcnmapp_tag</param>
<param name="dbTag">dcnmtempdb_tag</param>
</database>
5. cd /opt/huawei/I2000/bin/
6. Run the command:
expandb -Dperfdb_default -S2048000 -Tdata -P/opt/sybase/data
The parameters are: -D<database tag> -S<size to add (KB)> -T<data|log> -P<path|default>
7. The result should look like this:
ServerName: SYB,DatabaseName:perfdb,AccessLib:libctl63-md.so,UserName:sa,Password:******
Expand successfuly!


If the result does not show success, it may look like this:
get database information fail
In that case, check whether the parameters are correct:
- path: it should be the directory that holds the database devices.
- database tag: it should match the tag in the sacsvc.xml file.
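Because a wrong tag, type, or path makes expandb fail, it can help to assemble the command line from checked values. The sketch below is only an illustration: the helper name is hypothetical, and it uses nothing beyond the -D, -S, -T and -P parameters shown in this case.

```python
def build_expandb_command(db_tag: str, size_kb: int, kind: str, path: str = "default") -> str:
    """Assemble an expandb command line from the four parameters used in this case."""
    if kind not in ("data", "log"):
        raise ValueError("kind must be 'data' or 'log'")
    if size_kb <= 0:
        raise ValueError("size_kb must be a positive number of KB")
    if not db_tag:
        raise ValueError("db_tag must match a dbTag entry in sacsvc.xml")
    return f"expandb -D{db_tag} -S{size_kb} -T{kind} -P{path}"


# The values from step 6: add 2048000 KB (2000 MB) of data space to perfdb.
cmd = build_expandb_command("perfdb_default", 2048000, "data", "/opt/sybase/data")
print(cmd)
print(f"size to add: {2048000 // 1024} MB")
```

The helper only builds the string; running it on the I2000 server and checking for the success message remains a manual step.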

8.3 How to clear the alarm data from the I2000 database

Title: How to clear the alarm data from the I2000 database

ID: SE0000381662

Update time: 2009-03-31 15:33:56

Author: P.A. Tekla Vijesiri

Product Family:


Fault Type: I2000

Keywords: alarm,I2000

Digest:


We need to delete some alarms from the I2000 database.

Alarm Information:

Null.

Cause Analysis:

Usually there is no need to delete alarms from the database, and doing so is not recommended; alarms can be removed in the I2000 interface. If you do want to delete alarms in the database, be careful.

Handling Process:

To delete the current alarms from the database:
1. Log in to the database and run:
use alarmdb
go
delete tbl_cur_alm ------ deletes the current alarms
go
2. After deleting the alarms, restart the FaultService:
svc_adm -cmd restartsvc FaultService
3. The steps above delete the current alarms. To remove the history alarms, change the table name to tbl_his_alm. To delete a specific alarm, use the command below:
delete tbl_cur_alm where Csn=<alarm sequence ID>


Null.

8.4 FAQ-How to screen the snmpagent process of I2000

Title: FAQ-How to screen the snmpagent process of I2000

ID: SE0000350476

Update time: 2008-12-31 14:41:42

Author: XiaoPing

Product Family:


Fault Type: I2000

Keywords: I2000 snmpagent

Digest:


The snmpagent process of I2000 is used for the northbound interface. If the northbound interface is not used onsite, the process should be screened (disabled). Otherwise I2000 tries to start the process whenever I2000 starts. If it cannot start the process, the command "svc_adm -cmd status" on I2000 reports:
Service Agent: snmp_agent [1 service(s)]
SnmpAgent [not running ]
Customers may be sensitive to this and think it is a problem, so we need to screen this process so that it is not started.
Version: I2000V300R001.2D308

Alarm Information:

Service Agent: snmp_agent [1 service(s)] SnmpAgent [not running ]

Cause Analysis:

Null.

Handling Process:

1. Modify the file /opt/huawei/I2000/etc/conf/sacsvc.xml. Comment out the following 3 parts:
(1)
<!--svcagent name="snmp_agent">
<param name="port">51011</param>
<service name="SnmpAgent">
<param name="agent_port">4700</param>
<param name="svctype">SnmpAgent</param>
<param name="dllvalue">snmpagent_impl</param>
<param name="dependency">FaultService</param>
<param name="group">Kernel Services</param>
</service>
</svcagent-->
(2)
(3)
2. Execute the command: svc_adm -cmd reload
3. Check the status again. The snmpagent process is not started, and its running information is no longer shown in the result.
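Why commenting out the block hides the agent can be illustrated with a standard XML parser: a commented-out element is simply not part of the parsed tree. The sketch below is an assumption-laden stand-in, not the real file: the <config> root element and the shortened agent definitions are hypothetical, and how I2000 itself reads sacsvc.xml is not described in this case.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for sacsvc.xml. The <config> root and the
# layout of the second agent are assumptions made for this illustration.
SAMPLE = """<config>
  <!--svcagent name="snmp_agent">
    <param name="port">51011</param>
    <service name="SnmpAgent">
      <param name="svctype">SnmpAgent</param>
    </service>
  </svcagent-->
  <svcagent name="mml_agent">
    <service name="NorthPerf"/>
  </svcagent>
</config>"""


def active_agents(xml_text: str) -> list:
    """Return the names of svcagent elements that are not commented out."""
    root = ET.fromstring(xml_text)  # comments are dropped by the default parser
    return [agent.get("name") for agent in root.iter("svcagent")]


agents = active_agents(SAMPLE)
print(agents)
```

A check like this before running svc_adm -cmd reload also confirms that the edited file is still well-formed, since an unbalanced comment makes the parser raise an error.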


Null.

8.5 I2000 can not add new NE after deleted the old NE(the same type)

Title: I2000 can not add new NE after deleted the old NE(the same type)

ID: SE0000362220

Update time: 2008-12-31 14:42:42

Author: gaoyantao

Product Family:


Fault Type: I2000

Keywords: I2000 delete NE

Digest:


We deleted the WISG NE from the I2000 LMT. When we tried to add it again, we got the error "IP and ports has existed".

Alarm Information:

We deleted the WISG NE from I2000 LMT, after that we want to add it again, but we faced error with "IP and ports has existed"

Cause Analysis:

We checked all of the parameters and did not find any error, and the WISG NE was not visible in the LMT window. We then checked the database tables and found no information about WISG, so there was nothing to delete from the database. We added the NE again with tracing enabled to analyze it:
trace_adm -cmd settrace -level debug -tag all -svcagent oam_com_ts_agent
trace_adm -cmd settrace -level debug -tag all -svcagent med1_agent
From the logs we could see that there was still information in the SNMPv3Protocol table, which is what prevented the NE from being added.

Handling Process:

Delete the SNMPv3Protocol MO from this table, then we can add it successfully


Null




8.6 NorthPerf service has stopped running

Title: NorthPerf service has stopped running

ID: SE0000334101

Update time: 2008-07-03 11:13:34

Author: 00702705Henry Sewordor

Product Family:


Fault Type: I2000

Keywords:

Digest:


The NorthPerf service has stopped running:
Service Agent: mml_agent [1 service(s)]
NorthPerf [not running ]
The NorthPerf service had been running for two weeks and stopped on 29-05-2008. I2000 server version: I2000 V300R001.2Dh04. The northbound interface stopped receiving performance data.

Alarm Information:

No alarm. However, the data of the created performance statistics tasks was no longer generated at the northbound interface (in the directory /opt/huawei/I2000/var/northshare/nmsuser).

Cause Analysis:

1. The NorthPerf service is controlled by iMAP.mml_agent. When I checked the iMAP.mml_agent.trace log file, I observed an error in creating or locating "$IMAP_ROOT/var/northshare/nmsuser!".
2. I observed that the perfdb log database space was full.
3. I queried the perfdb database and found that the performance data files exist.
This means that the performance statistics data files are present but are not sent to the northbound interface (/opt/huawei/I2000/var/northshare/nmsuser). The problem occurred because the NorthPerf service was not running. The NorthPerf service stopped running because the perfdb log database space was full and iMAP.mml_agent failed to obtain the environment variable $IMAP_ROOT referenced in the PerfNbiGlobal.xml configuration file.

Handling Process:

1. I changed the "<PerfDataRoot>$IMAP_ROOT/var/northshare/nmsuser</PerfDataRoot>" parameter in the PerfNbiGlobal.xml configuration file to /opt/huawei/I2000/var/northshare/nmsuser. The parameter now holds the absolute path: "<PerfDataRoot>/opt/huawei/I2000/var/northshare/nmsuser</PerfDataRoot>"
2. Expanded the perfdb log database. Command: expandb -Dperfdb_default -S102400 -Tlog -Pdefault
3. Started the NorthPerf service. Command: svc_adm -cmd startsvc NorthPerf
4. The NorthPerf service is now running, and the northbound performance data files are generated again.
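The failure mode behind the PerfDataRoot change can be sketched as follows: when a variable such as $IMAP_ROOT is not available to the process, the path stays unexpanded and points nowhere. This is only an illustration using Python's string.Template; how iMAP.mml_agent actually expands the variable is not described in this case, and the helper names are hypothetical.

```python
from string import Template

CONFIGURED = "$IMAP_ROOT/var/northshare/nmsuser"


def expand_path(configured: str, env: dict) -> str:
    """Expand $NAME references from env; unknown names are left untouched."""
    return Template(configured).safe_substitute(env)


def is_resolved(path: str) -> bool:
    """A path that still contains '$' was not fully expanded."""
    return "$" not in path


# Variable missing from the environment: the literal string survives.
broken = expand_path(CONFIGURED, {})
# Variable present: the path becomes the absolute directory used in the fix.
fixed = expand_path(CONFIGURED, {"IMAP_ROOT": "/opt/huawei/I2000"})

print(broken, is_resolved(broken))
print(fixed, is_resolved(fixed))
```

Writing the absolute path directly into PerfNbiGlobal.xml, as in step 1, removes the dependency on the variable being set.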

Suggestions and summary:

Null.

8.7 I2000 Current Fault Alarm Browser is not refreshing

Title: I2000 Current Fault Alarm Browser is not refreshing

ID: SE0000330274

Update time: 2008-05-31 17:14:22

Author: s75730

Product Family:


Fault Type: I2000

Keywords: Fault Alarm Browser

Digest:


The I2000 (V300R001.2D8) client was installed on a Windows XP machine. It was found that the I2000 client was unable to refresh the current fault alarm browser automatically. To see the current alarms, the user had to press the "Apply Again" button.

Alarm Information:

Null.

Cause Analysis:

When the problem was found, the same software was installed on another machine with the same configuration as the problem machine, and there it worked well. While analyzing the problem, it was found that the original client machine has two NIC cards serving two different networks.

Handling Process:

To solve this problem, first disable the other NIC card, which has no relation to the I2000 working network, and start the I2000 client. The client then worked fine, and the current fault alarm browser refreshed automatically. Later, when the disabled NIC card was enabled again, the same problem arose.


Do not install the I2000 client on a machine (Windows platform) that has two NIC cards serving two different networks.

8.8 Which is the difference between SNMP and MML signaling in I2000?

Title: Which is the difference between SNMP and MML signaling in I2000?

ID: SE0000316421

Update time: 2008-03-31 09:30:11

Author: 77963Carlos David Huaman Torjek

Product Family:


Fault Type: I2000

Keywords:

Digest:





We will begin discussing with customer about the I2000 SNMP signaling to their NMS Service. I want to send them the guide file I am attaching to this ticket. However, I do not understand something about the SNMP and the MML part. The MML part contains NE Elements that the SNMP part does not: USAU, MT Server, SMSC. What does this mean? That the SNMP part does not contain alarms referred to these elements? According to the manual, only SNMP signaling is used for Overseas countries. Not MML signaling. And customer is willing to use SNMP only. Please assist.

Alarm Information:

Null.

Cause Analysis:

I2000 supports only the SNMP protocol for the NMS connection. The customer does not need to know how to use MML commands; MML is just a protocol for communication between I2000 and MML NEs. It is easy to misread the northbound alarm document. In fact, all the alarms in both the MML sheet and the SNMP sheet are uploaded to the NMS using SNMP. The alarms are simply sorted by the two NE types (MML/SNMP). There are two protocols for the connection between the I2000 server and an NE: MML and SNMP. For example, USAU, MT Server, SMSC, etc. connect to I2000 using the MML protocol. Others (WAPGW, MMSC, Host) use SNMP. So there are two types of NE: MML and SNMP.

Handling Process:

The Northbound Alarm Static Information document only groups the alarms by NE type. In fact, all the alarms in the SNMP, MML, and I2000 sheets are transferred to the NMS when the alarm event occurs.


Null.

8.9 Can not create NE MT server on I2000 client

Title: can not create NE MT server on I2000 client

ID: SE0000322535

Update time: 2008-03-31 18:59:33

Author: Nguyen Phu Nam

Product Family:


Fault Type: I2000

Keywords: NE,I2000,MT server

Digest:


When I created the NE MT server on the I2000 client, it displayed "connecting to maintenance port failed" for port 6500, although ports 6400 and 6300 could be connected. The I2000 version is Dh04 (for Suse), and the MT server version is De14 (ATAE).




Alarm Information:

error: "connecting to maintenance port failed"

Cause Analysis:

This may be caused by an unstable network.
This may be caused by router routing or by a firewall blocking the port.

Handling Process:

Checked the network connection from the MT server to I2000; it is very stable.
Checked the routing and firewall rules for port 6500 with the customer; they confirmed that the port is open for all connections from I2000 to the MT server.
Checked mtserver.ini against the document guide; it is OK.
Checked the settings used when creating the NE MT server on the I2000 client. We found that if a password is entered for user sa, I2000 cannot connect to the MT server and reports the error described above. So we set the fields as below:
subscriber user: sa
subscriber PIN: (leave this field empty)
I2000 can then connect to the MT server without the error "connecting to maintenance port failed".


Check all configuration files, documents, and the troubleshooting guide.

8.10 A fake CPU usage is showed by I2000

Title: A fake CPU usage is showed by I2000

ID: SE0000330389

Update time: 2008-06-16 08:37:59

Author: Christian Chavez Franco

Product Family:


Fault Type: I2000

Keywords: fake alarm cpu usage i2000

Digest:


On an RBT site, iManager I2000 V300R001.2Dh04 is used as the management system. The I2000 client shows an overloaded CPU usage that is not real.

Alarm Information:

A QoS alarm is sent by I2000 through the northbound SNMP interface to the customer. In addition, the alarm is shown permanently in the I2000 client.

Cause Analysis:

1. Check the CPU usage with the command top -c -d1 in the CLI. This shows that the CPU is not overloaded.
2. Check the System Monitor Browser in the I2000 client. This shows a CPU usage rate of 100%.

Handling Process:

1. Check the CPU usage on the server using top -c -d1.
2. Collect the log iMAP.monitor_agent.trace from the folder $iMAP_ROOT/var/logs/.
3. All the information was sent to RND, which determined that it is a bug.
4. We replaced the ResourceMonitor file with a new one sent by RND and restarted the I2000 service.



You need to stop all I2000 services with the following commands as the root user.
# stop_svc
# stop_daem
See the attached document for the procedure.


Chapter 9 USAU Cases

9.1 MTP Link failed because of peer end problem.

Title: MTP Link failed because of peer end problem.

ID: SE0000380561

Update time: 2009-03-31 02:28:29


Product Family: Service and Software Public Product: USAU&SAU

Fault Type: USAU

Keywords: link failed, SIOS

Digest:


In the USAU alarm window, we frequently found the "MTP LINK FAILED" error message, which recovered automatically. In detail we found:
"MTP LINK FAILED"
Module No.=41, LinkNo=0, ReasonID=15
Link failed because received SIOS from the peer end.
Link failed because received SIOS from the peer end, the peer end send break link message.
Check the EPI board connection and clock system.
USAU Version: 1.5D503

Alarm Information:

"MTP LINK FAILED"
Module No.=41, LinkNo=0, ReasonID=15
Link failed because received SIOS from the peer end.
Link failed because received SIOS from the peer end, the peer end send break link message.

Cause Analysis:

First we checked the physical connection of this module and found it OK. This module also has another link to another STP that is OK, which means our module has no problem. Then we checked the clock connection of the USAU and found it OK.

%%LST CKICFG:CONFIRM=Y;%%
RETCODE = 0 Operation succeeded

CKI Config Information
----------------------
ClockType = BITS1 E1 .&. BITS2 E1
ClockLevel = Level3
Work Mode of the Clk = Automatic
Source Clock = BITS source clock 1
BITS1 Clock Priority = Not configed
BITS2 Clock Priority = Not configed
Line Clock1 Priority = Level1
Line Clock2 Priority = Level2

Then use the LST BOSRC command to check the clock source:

%%LST BOSRC: FN=0, SN=2,CONFIRM=Y;%%
RETCODE = 0 Operation succeeded

EPI Board Output Clock Reference Config Information
---------------------------------------------------
Frame number = 0
Slot number = 2
E1 number = 0
--- END

%%DSP BRD: FN=0, SN=13, PSN=BB,CONFIRM=Y;%%
RETCODE = 0 Operation succeeded

Board Information
-----------------
Board type = WCKI
Board status = Standby OK
Work mode = Auto
SRAM Status = Normal
DDS Status = Normal
Multiplier88915 Status = Normal
Oscillator Chip Status = Normal
H.110 Chip Status = Normal
E1 Chip-1 Status = Normal
E1 Chip-2 Status = Normal
Current Reference Source = LINE1 Clock Source
Reference Source State = Normal
Clock Level = LEVEL 3
BITS1 CLock Priority = Not Configured
BITS2 CLock Priority = Not Configured
LINE1 CLock Priority = LEVEL 1
LINE2 Clock Priority = LEVEL 2
-- END

Finally, we set up a self-loop on this E1 port and waited 2 to 3 hours, but we did not find any error message on this link. So the problem is not on our side. We informed the peer-end (transmission/STP) engineers and asked them to check. After checking, the transmission engineer confirmed that transmission is OK, so the STP side needed to be checked. Finally the STP engineers confirmed that the problem was on their side.

Handling Process:

The STP engineers changed the configuration on their side and the problem was solved. They informed us that the link failure happened when they configured 4 links on one board. After they moved one link off that board, the problem was solved.

Suggestions and summary: We should confirm that there is no problem in USAU side.

9.2 EPI E1 local lose synchronization alarm in USAU alarm window

Title: EPI E1 local lose synchronization alarm in USAU alarm window

ID: SE0000357063

Update time: 2008-11-27 10:08:14


Product Family: Service and Software Public Product: USAU&SAU

Fault Type: USAU

Keywords: no clock source

Digest:


The USAU was showing a continuous "EPI E1 local lose synchronization" alarm in the alarm window. In detail: Check the connection and clock system.




Alarm Information: EPI E1 local lose synchronization.

Cause Analysis:

First check the clock configuration using the LST CKICFG command. No problem was found:

%%LST CKICFG:CONFIRM=Y;%%
RETCODE = 0 Operation succeeded

CKI Config Information
----------------------
ClockType = BITS1 E1 .&. BITS2 E1
ClockLevel = Level3
Work Mode of the Clk = Automatic
Source Clock = BITS source clock 1
BITS1 Clock Priority = Not configed
BITS2 Clock Priority = Not configed
Line Clock1 Priority = Level1
Line Clock2 Priority = Level2

Then use the LST BOSRC command to check the clock source:

%%LST BOSRC: FN=0, SN=2,CONFIRM=Y;%%
RETCODE = 0 Operation succeeded

EPI Board Output Clock Reference Config Information
---------------------------------------------------
Frame number = 0
Slot number = 2
E1 number = 0
--- END

Finally check the status of the clock board using the DSP BRD command. This shows the following problem:

%%DSP BRD: FN=0, SN=15, PSN=BB,CONFIRM=Y;%%
RETCODE = 0 Operation succeeded

Board Information
-----------------
Board type = WCKI
Board status = Main OK
Work mode = Auto
SRAM Status = Normal
DDS Status = Normal
Multiplier88915 Status = Normal
Oscillator Chip Status = Normal
H.110 Chip Status = Normal
E1 Chip-1 Status = Normal
E1 Chip-2 Status = Normal
Current Reference Source = No Clock Source
Reference Source State = Normal
Clock Level = LEVEL 3
BITS1 CLock Priority = Not Configured
BITS2 CLock Priority = Not Configured
LINE1 CLock Priority = LEVEL 3
LINE2 Clock Priority = LEVEL 3
--- END

Here Current Reference Source = No Clock Source, which is wrong. That is why the alarm is generated, and we need to find the cause. When we use the ADD BOSRC command, we must select an 'E1 number' that is active. That is, if we select E1 number=0, E1 0 must be active. In our case we selected E1 number = 0, but this E1 is not active, so we should modify it.

Handling Process:

First check which E1 is normal using the DSP BRD command:

%%DSP BRD: FN=0, SN=2, PSN=BB,CONFIRM=Y;%%
RETCODE = 0 Operation succeeded

Board Information
-----------------
Board type = WEPI
Board status = Normal
E1 port 0 state = Faulty
E1 port 1 state = Normal
E1 port 2 state = Normal
E1 port 3 state = Normal
E1 port 4 state = Normal
E1 port 5 state = Normal
E1 port 6 state = Faulty
E1 port 7 state = Normal
Balance Mode = No Balance
--- END

So all E1s except E1_0 and E1_6 are active, and we can use any one of them as the clock source. Modify the clock source E1 using the following commands.

First remove:
RMV BOSRC: FN=0, SN=2, EN=0;
Then add:
ADD BOSRC: FN=0, SN=2, EN=1;

Check the modification:

%%LST BOSRC: FN=0, SN=2,CONFIRM=Y;%%
RETCODE = 0 Operation succeeded

EPI Board Output Clock Reference Config Information
---------------------------------------------------
Frame number = 0
Slot number = 2
E1 number = 1
--- END

Also check the clock board status:

%%DSP BRD: FN=0, SN=13, PSN=BB,CONFIRM=Y;%%
RETCODE = 0 Operation succeeded

Board Information
-----------------
Board type = WCKI
Board status = Standby OK
Work mode = Auto
SRAM Status = Normal
DDS Status = Normal
Multiplier88915 Status = Normal
Oscillator Chip Status = Normal
H.110 Chip Status = Normal
E1 Chip-1 Status = Normal
E1 Chip-2 Status = Normal
Current Reference Source = LINE1 Clock Source
Reference Source State = Normal
Clock Level = LEVEL 3
BITS1 CLock Priority = Not Configured
BITS2 CLock Priority = Not Configured
LINE1 CLock Priority = LEVEL 1
LINE2 Clock Priority = LEVEL 2
-- END
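Choosing an active E1 from the DSP BRD output can be sketched as a small parser. This is only an illustration under stated assumptions: the function name and regular expression are hypothetical, and the text being parsed is the WEPI board output quoted in this case.

```python
import re

# E1 port states as reported by DSP BRD for the WEPI board in this case.
DSP_BRD_OUTPUT = """
E1 port 0 state = Faulty
E1 port 1 state = Normal
E1 port 2 state = Normal
E1 port 3 state = Normal
E1 port 4 state = Normal
E1 port 5 state = Normal
E1 port 6 state = Faulty
E1 port 7 state = Normal
"""

PORT_RE = re.compile(r"E1 port (\d+) state = (\w+)")


def normal_e1_ports(text: str) -> list:
    """Return the numbers of all E1 ports whose state is Normal."""
    return [int(num) for num, state in PORT_RE.findall(text) if state == "Normal"]


ports = normal_e1_ports(DSP_BRD_OUTPUT)
command = f"ADD BOSRC: FN=0, SN=2, EN={ports[0]};"
print(ports)
print(command)
```

Taking the first Normal port gives the same E1 number that was configured by hand above; any other Normal port would serve equally well as the clock source.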


We must select active E1 as clock source.

9.3 USAU Signaling remote syn lose

Title: USAU Signaling remote syn lose.

ID: SE0000332291

Update time: 2008-06-11 08:10:01

Author: 85803Panyatat Tammarong

Product Family: Service and Software Public Product: USAU&SAU

Fault Type: USAU

Keywords:

Digest: USAU Signaling remote syn lose.


WCSU Board Version Information
------------------------------
PCB Version = WCSU VER 4
BIOS Level1 Version = SF3USPI4 112
BIOS Level2 Version = SoftX3000V100R001
Software Version = USAU V100R001.2D432 20040806
Logic Version = (U62) 0003

Alarm Information: USAU alarm description: EPI E1 local lost synchronization.

Cause Analysis:

Check the connection and clock system:
- Checked the clock system cable and configuration; they are OK.
- Checked the E1 loop at the local DDF and the far-end DDF. With the E1 looped, the E1 port status on the EPI board is normal, which means the E1 cable is OK.
- This project uses 120 ohm E1 cable.
- Checked the USAU EPI configuration mode: the balance mode is configured as balanced (120 ohm).
- Checked the DIP switches of the EPI board, which must also be set to support 120 ohm E1, and found that DIP switches S1 and S2 were set incorrectly.

Handling Process:

How to set the EPI DIP switches to support 120 ohm. The DIP switches on the new boards are S1: PGND or NC; S2: PGND or NC; S3-S4: 75 or 120 ohm; S5: 75 or 120 ohm.
S1: DIP switches 1-8 set to NC
S2: DIP switches 1-8 set to NC
S3: DIP switches 1-8 set to 120 ohm
S4: DIP switches 1-8 set to 120 ohm
S5: 2 on, 1 off


None

9.4 Abnormal connection between MEM module and SMC because module parameter slip window is open during SMC deployment

Title: Abnormal connection between MEM module and SMC because module parameter slip window is open during SMC deployment

ID: SE0000257157

Update Time: 2006-12-29 09:53:02

Author: wangjian

Product Family: Service and Software Public Product: USAU&SAU

Fault Type: USAU

Keywords: slip window

Digest: Phenomenon Description:

MEM module cannot connect with SMC CTI Server in the SMC of site A.

Alarm Information:

MEM module cannot connect with SMC CTI Server




Cause Analysis:

The parameter Communication With Slip Window(0-unused,1-use) in the module connecting with SMC is 1. After the parameter is changed to 0, the connection between MEM and SMC is normal. Note: In USAU, this parameter determines whether the slip window protocol is used for USAU to communicate with upper layer users. For IN and HLR, the slip window protocol is used. However, this protocol is not used for SMC. The value 0 indicates the protocol is not used and 1 indicates the protocol is used.

Handling Process:

Change the parameter Communication With Slip Window(0-unused,1-use) in the module connecting with the SMC from 1 to 0. After the change, the connection between MEM and SMC is normal.


None.

9.5 USAU wcsu board fault alarm problem

Title: USAU wcsu board fault alarm problem

ID: SE0000299799

Update Time: 2007-10-18 05:08:05

Author: Han Peng

Product Family: Service and Software Public Product: USAU&SAU

Fault Type: USAU

Keywords: mem snoop usau wcsu

Digest:


In the USAU alarm management system we always find the "MEM card fault" alarm, followed by a WCSU board switchover. The alarm appears at irregular intervals: sometimes every 30 minutes, sometimes every hour. The following is part of the alarm information:

Alarm Information:

71036 MEM card fault Minor 2007-09-12 18:42:49 2007-09-12 18:42:51 2855 Communication 22
71037 MEM card fault Minor 2007-09-12 20:22:16 2007-09-12 20:22:18 2855 Communication 23
71038 MEM card fault Minor 2007-09-12 21:36:52 2007-09-12 21:36:56 2855 Communication 23
71039 MEM card fault Minor 2007-09-12 21:45:56 2007-09-12 21:45:58 2855 Communication 23
71040 MEM card fault Minor 2007-09-12 23:13:02 2007-09-12 23:13:04 2855 Communication 23
71041 MEM card fault Minor 2007-09-12 23:13:53 2007-09-12 23:13:55 2855 Communication 22
71042 MEM card fault Minor 2007-09-12 23:17:38 2007-09-12 23:17:40 2855 Communication 23
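The irregular spacing of these alarms can be made explicit by computing the gaps between consecutive occurrences. In this sketch the first timestamp of each alarm row is assumed to be the time the alarm was raised (the column headers are not given in this case), and the function name is hypothetical.

```python
from datetime import datetime

# First timestamp of each "MEM card fault" row listed in this case,
# assumed here to be the time the alarm was raised.
RAISED = [
    "2007-09-12 18:42:49",
    "2007-09-12 20:22:16",
    "2007-09-12 21:36:52",
    "2007-09-12 21:45:56",
    "2007-09-12 23:13:02",
    "2007-09-12 23:13:53",
    "2007-09-12 23:17:38",
]


def intervals_seconds(timestamps: list) -> list:
    """Return the gaps, in seconds, between consecutive timestamps."""
    times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


gaps = intervals_seconds(RAISED)
print(gaps)
print(f"shortest gap: {min(gaps)} s, longest gap: {max(gaps)} s")
```

The gaps range from under a minute to more than an hour and a half, which fits an intermittent link-level cause better than a periodic task.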

Cause Analysis:

When we meet this problem, we can first consider several possible causes:
1. Network problem: for example, some node in the network sends broadcasts, or IP packets are lost in the network.
2. Version problem: possibly a problem with the upper-layer user version, so that when the USAU sends a heartbeat, the upper-layer user cannot respond in time.

Handling Process:

We first checked the USAU and IN versions; both are correct and universal. So we concluded that it was a network problem. We then pinged the USAU WCSU board IP from IN for a long time, but no packets were lost. After that we ran the network analysis tool "snoop" on the SUN server to check whether some node in the network was sending broadcasts, but the result was normal. Finally we checked the USAU switch and IN switch configuration. The port on the USAU switch that connects to the IN switch was configured with "interface Ethernet0/23 duplex full speed 100", but the IN-side switch was not configured with "duplex full speed 100"; it used the default configuration. So we also reset the USAU port configuration to the default value. We then monitored for several days, and the alarm no longer appeared.


None.


Chapter 10 OS Cases

10.1 How to Solve The problem "Windows out of license"

Title: How to Solve The problem "Windows out of license"

ID: SE0000365989

Update time: 2008-12-25 09:23:33

Author: 00716125Pham Xuan Tuan

Product Family:



Keywords:

Digest:


Windows Server 2000 continuously generates an error in the Event Viewer Application log: The product Windows Server is out of license

Alarm Information:

The OS reports an error in the Event Viewer Application log: The product Windows Server is out of license.

Cause Analysis:

When you add a new server running Windows Server 2000 (or Windows Server 2003) to a Small Business Server (SBS) network without configuring additional licenses, License Service errors appear in the application log. Each SBS 2000 client access license (CAL) authorizes access to any server running in the SBS network. Therefore, if the number of CALs configured on the new server does not equal the number of CALs installed in the network, the "Windows out of license" error is reported.

Handling Process:

You have to configure CALs on the new server that is added to the network. Follow these steps:
1. Click Start - Administrative Tools - Licensing.
2. On the License menu, click New License.
3. In the Product box, click Windows Server.
4. In the Quantity box, add a number of CALs equal to the number of CALs installed on the SBS computer.
   To determine how many CALs are installed in SBS 2000, click Start, click Small Business Server Administrator Console, and then click the About link on the Server Status (BackOffice Home) window.
   To determine how many CALs are installed on Windows Small Business Server 2003, click Start, click Server Management, click Licensing, and then view the Installed licenses value.
5. Click OK, and then quit Licensing.
For details, check the attached document.


Null.




10.2 Command mstsc console not available in Windows XP SP3, Server 2008, Vista

Title: Command mstsc console not available in Windows XP SP3, Server 2008, Vista

ID: SE0000380071

Update time: 2009-03-11 17:25:44

Author: Amir Kadirov

Product Family:


Fault Type: OS/SERVER

Keywords: mstsc console Windows XP SP3, Server 2008, Vista

Digest:


The command mstsc /console in Windows XP SP3/Server 2008/Vista does not connect to the same session but opens a different session when you log in to Windows Server 2003.

Alarm Information:

None.

Cause Analysis:

In Windows XP, Windows Server 2003, and earlier versions of the Windows operating system, all services run in the same session, Session 0. The Microsoft Windows XP SP3/Server 2008/Vista operating systems isolate services in Session 0 and make Session 0 non-interactive. Only system processes and services run in Session 0.

Handling Process:

In this case, if you connect from Windows XP SP3/Server 2008/Vista to Windows Server 2003, use mstsc /admin instead of mstsc /console; it will then not open a different session.


If you use Windows XP SP3/Server 2008/Vista, use the command mstsc /admin to connect to the same session. The command mstsc /console no longer exists in Windows XP SP3/Server 2008/Vista.

10.3 a method to resolve password-lost problem(base on SUSE series OS)

Title: a method to resolve password-lost problem(base on SUSE series OS)

ID: SE0000344383

Update Time: 2008-09-19 15:52:05

Author: zhangliang

Product Family:


Fault Type: OS/SERVER

Keywords: password forgot SUSE 9 recovery

Digest:


If you have lost the root password on SUSE 9, the usual approach, according to the cases I have seen, is to insert the installation disk, choose "rescue mode", and then change root's password in "single-user mode". In SUSE 9, however, this back door is closed (root's password is required before entering "single-user mode"). Therefore, I will show another way to deal with it.

Alarm Information:

None




Cause Analysis:

None

Handling Process:

1. Edit the GRUB menu entry when you boot your system and add "init=/bin/bash" at the end of the second row, as follows:
   root (hd0,0)
   kernel /vmlinuz-2.4.18-5.47 ro root=/dev/sda2 init=/bin/bash
   initrd /initrd-2.4.18-5.47.img
2. After booting is finished, execute the following commands:
   # mount -o remount,rw /
   # passwd root
3. Set your new password, and then remount the root file system read-only:
   # mount -o remount,ro /
4. Reboot.


None.

10.4 FAQ-How to take snapshot in UNIX

Title: FAQ-How to take snapshot in UNIX

ID: SE0000121676

Update Time: 2005-10-15 15:43:32

Author: Wang Jianwei

Product Family:

Access Network Management System

Product: iManager N2000 BMS

Fault Type: Operation system

Keywords: password forgot SUSE 9 recovery

Digest:


How to take snapshot in UNIX

Alarm Information:

None

Cause Analysis:

None

Handling Process:

Right-click the Solaris desktop and select "applications-snapshot" in the pop-up menu. In the window that is displayed, set the options and delay for the snapshot (generally, the snapshot type is "window" and the delay is 2 seconds). Click the snapshot button and then click the required window. After taking the snapshot, click "save as" to save the picture in JPEG or GIF format. Note: Do not save the picture in the default format of the operating system; otherwise, the picture may not open on a PC.


None.

10.5 n2kuser User cannot Log in to SUN Operation System

Title: n2kuser User cannot Log in to SUN Operation System

ID: SE0000282176

Update Time: 2007-07-10 14:46:26

Author: Hu Weinan




Product Family:

Data Network Management System

Product: iManager N2000 DMS

Fault Type: Network Management Platform

Keywords: n2kuser

Digest:


The system was powered off and then restarted. The n2kuser user cannot log in to SUN, but the root user can log in.

Alarm Information:

When n2kuser logs in, the system prompts "unable to access home directory. Click OK to start a failsafe session, or cancel to restart login."

Cause Analysis:

According to the alarm information, the n2kuser directory was damaged or lost after the abnormal power-off. As a result, the n2kuser user cannot log in to SUN.

Handling Process:

1. Use fsck -y to repair the file system. If n2kuser still cannot log in to the SUN system, the n2kuser directory is lost and cannot be restored.
2. Log in as root and create the n2kuser directory: mkdir /n2kuser
3. Copy the three startup files under /n2kuser (.cshrc, .dtprofile, and .profile) from another SUN network management server of the same version on which n2kuser can log in. These files are the startup files of n2kuser and set the environment variables of N2000.
4. Upload the three files to the /n2kuser directory of the server. By default, the network management system is installed under /opt/n2000. If the on-site network management system is not installed under /opt/n2000, use a text editor to change every occurrence of /opt/n2000 to the on-site installation directory, save the files in UNIX format, and then put them under /n2kuser.
5. Check whether the user and group that own /n2kuser are n2kuser and n2kgroup. If not, change them.
6. Log out. The n2kuser user can then log in.
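The path replacement in step 4 can also be scripted instead of done by hand in a text editor. The following is a minimal sketch under stated assumptions: the function name retarget_startup_file is hypothetical, the new installation directory is assumed to contain no '#', '&' or backslash characters, and "UNIX format" is taken to mean removing carriage returns.

```shell
#!/bin/sh
# Hypothetical helper for illustration.
# Rewrites every occurrence of the default path /opt/n2000 in one startup
# file to the real installation directory and strips carriage returns so
# that the result has UNIX line endings.
# Assumption: new_root contains no '#', '&' or backslash characters.
retarget_startup_file() {
    src=$1
    dst=$2
    new_root=$3
    tr -d '\r' < "$src" | sed "s#/opt/n2000#${new_root}#g" > "$dst"
}
```

Called once for each of .cshrc, .dtprofile and .profile, with /n2kuser as the destination directory, this covers the editing part of step 4. The ownership check in step 5 still has to be done afterwards.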


None.

10.6 Can't find hard disk when installing SUSE Linux on HP DL380G4

Title: Can't find hard disk when installing SUSE Linux on HP DL380G4

ID: SE0000117471

Update Time: 2005-10-09 15:47:43

Author: Lilei

Product Family:

Data Service

Product: Data Service Total Solution

Fault Type:

Keywords: SUSE linux hard disk

Digest:


The system prompts that it cannot find the hard disk when installing SUSE Linux on an HP DL380G4.

Alarm Information:

The system prompts that it cannot find the hard disk when installing SUSE Linux on an HP DL380G4.

Cause Analysis:

The HP DL380G4 uses a Smart Array 6i controller, but the SUSE Linux 8.0 installation software does not include the driver for this RAID card, so the system cannot find the hard disk. The driver for this card must be loaded during installation.

Handling Process:

1. Download the driver cpq_cciss-2.4.52-11.ul10.i586.dd from yf-ftp or the HP web site.
2. Copy the application rawwrite from SUSE Linux installation CD1 to your PC.
3. Rename cpq_cciss-2.4.52-11.ul10.i586.dd to cciss.dd, and put it in the same folder as rawwrite.
4. Insert a formatted floppy disk into the floppy drive. Execute rawwrite, enter the driver file name cciss.dd, and enter the path A. The driver disk is ready when rawwrite completes.
5. Insert the SUSE Linux SP3 CD and restart the machine. Press "alt" when the installation menu appears, and then press Enter. The system prompts you to insert the floppy disk into the floppy drive. Insert it and press Enter. The system loads the driver from the floppy disk and can then find the hard disk.
6. Follow the steps of a normal installation.


None.

10.7 Linux System Cannot Save Modified Time, and After Restart, the Clock Changes.

Title: Linux System Cannot Save Modified Time, and After Restart, the Clock Changes.

ID: SE0000123884

Update Time: 2005-11-03 19:29:57

Author: Yang Zhiguo

Product Family:


Fault Type: OS/SERVER

Keywords:

Digest:


Linux system cannot save modified time, and after restart, the clock changes.

Alarm Information:

None

Cause Analysis:

1. By default, Linux treats the hardware clock as a UTC clock, which enables daylight saving time automatically. Daylight saving time has been cancelled in our country, so when setting the hardware clock and time zone in YaST2, set them to LOCAL clock and the Shanghai time zone respectively.
2. The /etc/rc.d/boot.clock script file contains an automatic time correction statement, so the system updates the time automatically when it restarts. Modify the corresponding statement to remove the fault.
3. The clock was not synchronized when the system time was modified. Use the command hwclock --systohc to solve this problem.

Handling Process:

Use the command date -s hh:mm:ss to modify the system clock. Then use the command hwclock --systohc to synchronize the system clock to the hardware clock (otherwise, the original, incorrect hardware clock overwrites the system clock when the system starts). The fault is then removed. The cause was that the system time had not been synchronized to the hardware clock.
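The two commands above can be wrapped so that the hardware clock is never left out of step with the system clock. The following is a minimal sketch: the function names is_valid_time and set_clock and the DRY_RUN variable are hypothetical, and running the commands for real requires root.

```shell
#!/bin/sh
# Hypothetical helpers for illustration.

# Succeed only for a 24-hour time of the form hh:mm:ss.
is_valid_time() {
    printf '%s\n' "$1" | grep -Eq '^([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]$'
}

# Set the system clock, then copy it to the hardware clock.
# With DRY_RUN=1 the commands are printed instead of executed.
set_clock() {
    new_time=$1
    if ! is_valid_time "$new_time"; then
        echo "invalid time: $new_time" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "date -s $new_time"
        echo "hwclock --systohc"
    else
        date -s "$new_time" && hwclock --systohc
    fi
}
```

Setting DRY_RUN=1 before calling set_clock shows the commands that would be run, which is a convenient way to check the input before changing the clock on a live server.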


None.




Chapter 11 DB Cases

11.1 Oracle cluster on windows 2003

Title: Oracle cluster on windows 2003

ID: SE0000351119

Update time: 2008-10-10 14:52:55

Author: Mihai Voica

Product Family:



Keywords: oracle cluster windows 2003 oraclemscsservices

Digest:


* The Oracle cluster was installed on Windows 2003 according to this guide: RBT – Oracle9i Database HA Mode Installation Guide (for ATAE Platform WIN2003+S3100)V2.0-20080515-B.zip
* The installation involves a semi-automated procedure that installs the Oracle server and sets up a 2-node cluster on Windows 2003.
* The user enters some basic information (IP address, cluster admin user and password, and so on), and the installation is then done by scripts.

Alarm Information:

After completing the installation of the oracle cluster on the windows 2003 operating system, the redundancy does not work. If node1 fails, the oracle cluster will not switch to node 2.

Cause Analysis:

Checking the Event Viewer shows that the cluster admin user of the Oracle server is not allowed to log in.

Handling Process:

Checking the admin user of the Oracle cluster shows that it is in the form user@domain. It seems that this old format is sometimes not recognized by newer versions of Microsoft Windows 2003.


Change the domain user to the following format: domain/user instead of user@domain.

11.2 Rebooting linux cause create oracle database failed

Title: Rebooting linux cause create oracle database failed

ID: SE0000399829

Update time: 2009-07-01 09:50:06

Author: Ai Yibo/94209

Product Family:






Keywords: oracle create database raw reboot

Digest:


During the deployment of a new RBT site in country S, the on-site engineer installed a DB cluster. When a new database was created with the dbca tool, the dbca program showed an error window and the installation failed.
Software environment:
OS: SUSE 9 SP3
Cluster: VCS 4.1.2
Application: Oracle 9208

Alarm Information:

Checking the alert log shows an error indicating that creating the control_files failed.

Cause Analysis:

We used the command "dd if=/dev/raw/rawX of=/root/log.txt" (X is the raw device number) to check the raw devices one by one and found that the raw devices were abnormal. According to Linux documentation, bindings made with commands such as "raw /dev/raw/raw1 /dev/vg_ora/ora_system" are lost after Linux reboots. The raw devices were therefore abnormal, which caused the Oracle installation error.

Handling Process:

Execute the raw commands again (the raw command binds a raw device to a block device, e.g. "raw /dev/raw/raw1 /dev/vg_ora/ora_system"). The database can then be created successfully with dbca.
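Because the bindings have to be repeated after every reboot, it can help to keep them in a list and reapply them from a script. The following is a minimal sketch: the function name rebind_raw_devices, the DRY_RUN variable and the mapping file format (one "raw-device block-device" pair per line) are hypothetical, and the raw command is invoked only in the form quoted in this case.

```shell
#!/bin/sh
# Hypothetical helper for illustration.
# Re-creates raw device bindings from a mapping file.
# Each non-empty, non-comment line holds: <raw device> <block device>
# With DRY_RUN=1 the raw commands are printed instead of executed.
rebind_raw_devices() {
    map_file=$1
    while read -r raw_dev block_dev; do
        case $raw_dev in
            ''|'#'*) continue ;;   # skip blank lines and comment lines
        esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "raw $raw_dev $block_dev"
        else
            raw "$raw_dev" "$block_dev" || return 1
        fi
    done < "$map_file"
}
```

A mapping file for this case would contain a line such as "/dev/raw/raw1 /dev/vg_ora/ora_system". Running the function with DRY_RUN=1 first shows the commands before any binding is changed.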


Bindings made with the "raw" command (the raw command binds a raw device to a block device, e.g. "raw /dev/raw/raw1 /dev/vg_ora/ora_system") are lost after Linux reboots, so pay attention to this when you create database files on raw devices.

11.3 Not automatic set environment variable causes that Oracle SqlPlus fails to start

Title: Not automatic set environment variable causes that Oracle SqlPlus fails to start

ID: SE0000396868

Update time: 2009-06-30 20:02:37

Author: Wilson Chaves w78135

Product Family:



Keywords: Oracle sqlplus Suse Linux ORACLE_HOME

Digest:


There is a problem when you try to start the Oracle SQL*Plus application to connect to the database. When an Oracle database has been installed on SUSE Linux 9.3 and you go to the folder $ORACLE_HOME/app/oracle/product/version/database/bin and start sqlplus with the following command:
$ ./sqlplus
you get the following error message:
Error 6 initializing SQL*Plus
Message file sp1<lang>.msb not found
SP2-0750: You may need to set ORACLE_HOME to your Oracle software directory
$
Unless the message file is corrupted, this common error message is related to Oracle environment variables that need to be set. These variables can be defined in the oracle user's profile so that they are set automatically when the user logs on.

Alarm Information:

When sqlplus is run, the Oracle error SP2-0750 is displayed.

Cause Analysis:

This error appears because sqlplus requires the message files and the environment variables ORACLE_HOME, ORACLE_BASE, and ORACLE_SID, and requires $ORACLE_HOME/bin to be included in PATH. If these variables are not set in the user profile, they must be set manually every time sqlplus is run.

Handling Process:

To avoid this error message, follow this procedure:
1. Verify the default shell on SUSE Linux:
   echo $SHELL
2. Switch to the oracle user (if this is the default database user):
   su - oracle
3. Define the following environment variables (ORACLE_HOME is the Oracle path):
   export ORACLE_HOME=/u01/app/oracle/product/version/database/
   export ORACLE_BASE=/u01/app/oracle
   export ORACLE_SID=orcl
   export LD_LIBRARY_PATH=$ORACLE_HOME/lib
   export PATH=$ORACLE_HOME/bin:$PATH
To have these environment variables set automatically each time you log in as oracle, add them to the ~oracle/.profile file, which is the user startup file for the Bash shell on SUSE Linux. You can copy and paste the following commands to make these settings permanent for the oracle user's Bash shell. The delimiter is quoted ('EOF') so that $ORACLE_HOME and $PATH are written to the file literally and expanded at login, not when the file is written:
su - oracle
cat >> ~oracle/.profile << 'EOF'
export ORACLE_HOME=/u01/app/oracle/product/version/database/
export ORACLE_BASE=/u01/app/oracle
export ORACLE_SID=orcl
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export PATH=$ORACLE_HOME/bin:$PATH
EOF


Write this configuration in ~oracle/.profile so that the Oracle environment variables are set automatically.
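Before starting sqlplus, the environment can be checked for the variables named in this case. The following is a minimal sketch: the function names missing_oracle_vars and oracle_bin_in_path are hypothetical, and the PATH check looks for the entry exactly as "$ORACLE_HOME/bin".

```shell
#!/bin/sh
# Hypothetical helpers for illustration.

# Print the name of every required Oracle variable that is unset or empty.
# Returns non-zero if any variable is missing.
missing_oracle_vars() {
    missing=0
    for name in ORACLE_HOME ORACLE_BASE ORACLE_SID; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Succeed only if ORACLE_HOME is set and $ORACLE_HOME/bin is a PATH entry.
oracle_bin_in_path() {
    [ -n "${ORACLE_HOME:-}" ] || return 1
    case ":$PATH:" in
        *":${ORACLE_HOME}/bin:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

If missing_oracle_vars prints any names, or oracle_bin_in_path fails, the profile has not been loaded for the current session and the SP2-0750 error is likely to appear.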

11.4 how to solve Oracle lock issue

Title: how to solve Oracle lock issue

ID: SE0000398581

Update Time: 2009-06-30 08:45:01

Author: chenzuihong

Product Family:



Keywords: oracle,lock,CRBT

Digest:


At an overseas CRBT site, the CRBT service is growing very fast, and the data in the Oracle database is increasing very fast. Sometimes the database runs slowly when we perform operations on it, and sometimes this is caused by locks.




Alarm Information:

Null

Cause Analysis:

Sometimes the database runs slowly during operations because of locks. We need to check for Oracle locks and resolve them.

Handling Process:

(1) Use the following SQL statements to check the Oracle database locks:
    select object_id,session_id,locked_mode from v$locked_object;
    select t2.username,t2.sid,t2.serial#,t2.logon_time from v$locked_object t1,v$session t2 where t1.session_id=t2.sid order by t2.logon_time;
(2) If some records remain in the result for a long time, the corresponding objects are locked abnormally.
(3) To release an abnormal, long-held lock, use the following SQL statement:
    alter system kill session 'sid,serial#';
    Here sid and serial# are the values of the locking session in the current Oracle instance, as returned by the query in step (1).
(4) Check the locks again. If the lock has disappeared, the lock problem is solved.


A deep understanding of the Oracle architecture helps us maintain CRBT better.
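The 'sid,serial#' placeholder in the kill statement is easy to mistype when values are copied from a query result. The following is a minimal sketch that builds the statement from the two numbers; the function name kill_session_sql is hypothetical, and both values are assumed to be plain integers taken from v$session.

```shell
#!/bin/sh
# Hypothetical helper for illustration.
# Builds the "alter system kill session" statement for one session.
# Both arguments must be non-empty and numeric; otherwise nothing is printed
# and the function returns non-zero.
kill_session_sql() {
    sid=$1
    serial=$2
    for v in "$sid" "$serial"; do
        case $v in
            ''|*[!0-9]*)
                echo "sid and serial# must be numeric" >&2
                return 1
                ;;
        esac
    done
    printf "alter system kill session '%s,%s';\n" "$sid" "$serial"
}
```

The printed statement can then be reviewed and run in a SQL session by a user with the required privileges.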

11.5 Oracle Undo Tablespace increasing Rapidly

Title: Oracle Undo Tablespace increasing Rapidly

ID: SE0000380601

Update Time: 2009-03-18 11:14:50

Author: Amr Zein Abdelhady 00712852

Product Family:


Fault Type: Database

Keywords: undo tablespace increase rapidly no managed connections

Digest:


At CRBT site X, the undo tablespace was growing rapidly. Within one week it occupied 70% of its 4 GB size. The customer increased the size to 8 GB, but the undo tablespace grew to 85% of the 8 GB. The customer then increased the size to 16 GB, but usage reached 90% of the 16 GB, which indicated that something was wrong.

Alarm Information:

Oracle kept generating errors "not enough space in undotablespace". In USDP Log you can see the error "no managed connections".

Cause Analysis:

Every action on a DB involves the undo tablespace. For example, when you delete 1000 records from a table, these 1000 records are held in the undo tablespace in case you need to roll back before committing. After you commit, the records are still held in the undo tablespace for the period set by the undo retention parameter, which is 900 seconds (15 minutes) for our DB. After 15 minutes the data is removed from the undo tablespace. So something in the DB was deleting records but not committing those deletions.

Handling Process:

After extensive checking, we found that the table t_userloginlog held over 100 million records. The job that deletes these records selects them based on the parameter log_preserve_time in the t_config table, which was configured as 12 days. Removing all those records while keeping two copies in the undo tablespace caused the problem. We stopped the job that deletes those records (job 226), truncated the table t_userloginlog, and restarted job 226. After that, the undo tablespace never grew beyond 1 GB, which is the normal behaviour.


First, the log_preserve_time parameter in the t_config table is described as being in months, but we have confirmed with R&D India that the value is in days, not months. The confirmation email from R&D regarding this problem in the t_config table is attached. Second, check the log tables regularly to prevent this from happening in the first place. Third, set the parameter log_preserve_time as low as possible, in line with customer recommendations.

11.6 SQL 2000 Server JDBC Driver Error by a bad installation of SQL Server

Title: SQL 2000 Server JDBC Driver Error by a bad installation of SQL Server

ID: SE0000165315

Update time: 2007-09-25 14:11:40

Author: Rodrigo Pichardo

Product Family:

Access Network Management System

Product: iManager N2000 BMS


Keywords: SQL Driver JDBC Problem

Digest:


While following the manual to install the N2000, in the "Database Server Config" step, after you have entered all the required information and created the alias (using the SQL Server Client Network Utility), you receive a "DB Connection Error".

Alarm Information:

DB Connection Error means:
* DBMS is offline
* The Server Name is error
* The Password of Super DB User is error
* The Password of NMS DB User is error
* The NMS Database User does not exist and no database will be installed this time.

Cause Analysis:

Reviewing "installdisk.log" shows the following error message: "java.sql.SQLException: [Microsoft] [SQLServer 2000 Driver for JDBC] Character set 437 not found in com.microsoft.util.transliteration.properties." This occurs because the installation software cannot find the JDBC libraries it uses to connect to the database, so you must reinstall SQL Server first.

Handling Process:

If the customer CANNOT reinstall SQL Server, download the "SQLServer 2000 JDBC Driver" from the Microsoft download site (JDBC Driver SP3 is the latest one). After downloading and installing the driver (see the Installation Guide document), copy the following files:
$MS_SQLServer_2000_Driver_For_JDBC_InstDir/lib/msbase.jar
$MS_SQLServer_2000_Driver_For_JDBC_InstDir/lib/mssqlserver.jar
$MS_SQLServer_2000_Driver_For_JDBC_InstDir/lib/msutil.jar
into the "lib" directory inside the source directory of the NMS installation software (by default "windows" or "win_sql"). A warning about overwriting the files appears; click OK. After that, you can start the NMS installation again without problems.


If you still receive the "DB Connection Error" after following the steps above, you may have to create the N2000user using "Enterprise Manager" with the same permissions as the "sa" user, and then try the installation again.
