This procedure describes how the NOC can determine whether a queue is backed up because of a network problem or because the customer has not logged in to download their mail.
- Check Openview for alarms indicating that a customer's queue is backed up, meaning the account has messages more than 2 hours old. The alarm text usually identifies the sync fep where the alarm occurred, along with the account and directory.
- Ack 18:59 12-10/2001 NM: /usr/ordnet/sendq has old files [GMS:AZZU]
- Once an alarm for a backed up queue has been issued, log on to the sync fep where the account resides (e.g. in the above example, the sync fep would be AZZU).
- Examine the directory the alarm is coming from
azzu_root > cd /usr/ordnet/sendq
azzu_root > ls -al
total 2722
drwxrwxrwx 2 genet other 464 Dec 10 19:23 .
drwxrwxrwx 16 genet other 976 Dec 10 19:50 ..
-rw-r--r-- 2 root other 1378517 Dec 10 19:13 b-3c150947
-rw-r--r-- 1 root other 91 Dec 10 19:13 b-3c150947.TAG
-rw-r--r-- 2 root other 65 Dec 10 19:27 b-3c150947.h
-rw-r--r-- 2 root other 18 Dec 10 19:27 b-3c150947.t
-rw-r--r-- 1 root other 37 Dec 10 19:23 lsend
- Check the date of the oldest message. This should be newer than the time listed for the lsend file (the lsend file should indicate the last time the network sent something to this account). Keep this in mind when continuing to troubleshoot the alarm in the steps below.
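The date comparison in this step can be scripted. The following is a minimal sketch, assuming a POSIX shell with find and sort available on the fep; the function name files_older_than_lsend is hypothetical and is not part of the sync fep tooling. It lists the queue files that were last modified at or before lsend, which are the ones that deserve a closer look.

```shell
# Hypothetical helper: list queue files that are NOT newer than lsend,
# i.e. files last modified at or before the lsend timestamp.
# Usage: files_older_than_lsend /usr/ordnet/sendq
files_older_than_lsend() {
    dir=$1
    # Without an lsend file there is nothing to compare against.
    [ -f "$dir/lsend" ] || { echo "no lsend file in $dir" >&2; return 1; }
    # -newer is a standard find primary; negating it keeps older-or-equal files.
    find "$dir" -type f ! -name lsend ! -newer "$dir/lsend" | sort
}
```

Empty output means every file in the queue is newer than lsend.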
- See if we initiate to the account
azzu_root > grep INI /usr/ordnet/config
INITIATE=no
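Note: in production, ordnet is actually set to INITIATE=yes; the output above is for illustration only.
- Because we do not initiate calls to this account, the customer has to connect to collect their mail, so the backup is not a problem. For the 15% of cases that do not follow this theory, continue with the steps below.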
- See if there's a dedicated line
azzu_root > grep DED /usr/ordnet/config
DEDICATED_PORT=no
- This account does not have a dedicated line. Keep this in mind for the troubleshooting steps below. If there were a dedicated line, it would be listed in place of no in the output above.
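The two grep checks above can be combined into one lookup. This is a sketch only, assuming the config file holds KEY=value pairs at the start of a line, as the grep output suggests; config_value and describe_account are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: print the value of KEY from a KEY=value config file.
# Only the first matching line is used.
config_value() {
    key=$1
    file=$2
    sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Summarize the two settings this procedure cares about.
# Usage: describe_account /usr/ordnet/config
describe_account() {
    file=$1
    initiate=$(config_value INITIATE "$file")
    dedicated=$(config_value DEDICATED_PORT "$file")
    # A key missing from the file is reported as "unknown".
    echo "initiate=${initiate:-unknown} dedicated=${dedicated:-unknown}"
}
```

For the example config above, describe_account /usr/ordnet/config would report initiate=no dedicated=no.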
- After running the two commands above, you can run the following commands to see if the account in question currently has an active session (either sending inbound or outbound)
- ps -ef |grep <account> will list any active sessions for the account and which line they are on
root 13270 1 0 18:23:38 ? 0:07 sh /usr/attmail/bin/hiisc5.0 DIAL gbfunb
- ls -al /usr/*/ACT* will list all accounts on the box that are currently active
- If both commands used above show an active session, but the creation date is old, this might indicate the line the account came in on is hung and needs to be stopped and restarted (very frequent with genet on AZZU).
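One pitfall with ps -ef | grep is that the grep process itself can show up in the results and be mistaken for a session. A small filter avoids this; the sketch below assumes a POSIX shell, and sessions_for is a hypothetical name, not an existing command on the fep.

```shell
# Hypothetical helper: filter `ps -ef` style output on stdin down to the
# lines that mention the account, dropping any grep process lines.
# Usage: ps -ef | sessions_for gbfunb
sessions_for() {
    grep -F -- "$1" | grep -v 'grep'
}
```

Empty output means no process on the box currently mentions the account.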
- If there is no active session, try the following:
- Reset the customer's dedicated line, if they have one (see Repumping a line)
- Run hireschedule accountname (answer no to the formtypes question)
- From the results of step 9, the date of the lsend file in step #4, and the date and size of the oldest file in the sendq, you should be able to determine whether the line needs to be reset. Pay close attention to the size of the file. Large files (usually close to or over 1M) may take longer to process than others, depending on which fep you are working on. If this is the case, proceed to verify that there is no activity for the account.
- If the account is set to INITIATE=yes and you run the hireschedule command but are still unable to connect, use himonitor to see whether the account is currently sending inbound. Most outbound sessions will be deferred until the inbound session has ended.
azzu_root > himonitor
Synchronous Gateway service status:
Process Status:
obndmgr process is active
hischeduler process is active
ack_nak process is active
diskmon process is active
logdemon process is active
nmdaemon process is active
Synchronous ports:
line10.0: 3780 DIAL OUT PORT, EMULATOR STARTED
line10.1: 3780 DIAL OUT PORT, NOT RUNNING
line20.0: 3780 DIAL OUT PORT, NOT RUNNING
line20.1: 3780 DIAL IN PORT, EMULATOR STARTED
line30.0: 3780 DIAL IN PORT, NOT RUNNING
line30.1: 3780 DIAL IN PORT, NOT RUNNING
line40.0: 3780 DIAL IN PORT, EMULATOR STARTED
line40.1: 3780 DIAL IN PORT, NOT RUNNING
line50.0: 3780 DIAL IN PORT, EMULATOR STARTED
line50.1: 3780 DIAL IN PORT, EMULATOR STARTED
line60.0: 3780 DIAL IN PORT, EMULATOR STARTED
line60.1: 3780 PORT TURNED OFF, NOT RUNNING
lineR.0: RESEND INBOUND PSEUDO-PORT, DIAL OUT PORT, EMULATOR STARTED, no hostagent
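When the port list is long, it can help to reduce it to the ports whose emulator is running. The sketch below assumes himonitor prints one port per line in the format shown above; started_ports is a hypothetical helper name, and the parsing would need adjusting if the real output differs.

```shell
# Hypothetical helper: read himonitor output on stdin and print the names
# of the ports whose status includes EMULATOR STARTED.
# Usage: himonitor | started_ports
started_ports() {
    awk -F: '/^[ \t]*line/ && /EMULATOR STARTED/ {
        sub(/^[ \t]+/, "", $1)   # tolerate leading indentation
        print $1
    }'
}
```

For the output above, this would list line10.0, line20.1, line40.0, line50.0, line50.1, line60.0 and lineR.0.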
- /usr/attmail/logs/log.current is the overall log for the entire sync fep (found on all sync feps), listing all activity and accounts coming inbound or outbound to the box. Pipe it through grep with the account name to narrow it down.
- /usr/lineNN.N/logs/currentRJE is the per-line log
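Searching either log for an account can be wrapped in a small function. This is a sketch under the assumption that the account name appears literally in the relevant log lines; account_log_tail is a hypothetical name, and the default of 20 lines is arbitrary.

```shell
# Hypothetical helper: show the most recent log lines that mention an account.
# Usage: account_log_tail ACCOUNT [LOGFILE] [COUNT]
account_log_tail() {
    acct=$1
    log=${2:-/usr/attmail/logs/log.current}   # overall fep log by default
    count=${3:-20}
    # -F treats the account name as a fixed string, not a regular expression.
    grep -F -- "$acct" "$log" | tail -n "$count"
}
```

Passing a per-line log as the second argument searches that file instead of the overall log.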