Xymon Mailing List Archive

Troubleshooting Purple CONN and HTTP Tests in Xymon 4.3.10

Larry Barber
Tue, 6 Nov 2012 08:47:55 -0600
Message-Id: <CAOnF4RAWH-KQDEzzUwhH77fLyLkx2+Hq4=user-18d022329e49@xymon.invalid>

Did you check whether a xymonnet process is (or was) still running? If a
process hangs for some reason, xymonlaunch won't start a new one. I
had this happen to me once, but only once. There is also a --debug flag for
xymonnet; it produces a _lot_ of output, but it might give you some
idea of what is going on.
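
A minimal sketch of that check, assuming a Linux server with the procps
version of ps. The etime_to_seconds helper and the MAX_AGE threshold are
illustrative names and values, not part of Xymon.

```shell
#!/bin/sh
# Convert the ps "etime" format ([[dd-]hh:]mm:ss) to seconds.
# etime_to_seconds is a hypothetical helper, not a Xymon tool.
etime_to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        n = NF
        s = $n                              # seconds
        if (n >= 2) s += $(n-1) * 60        # minutes
        if (n >= 3) s += $(n-2) * 3600      # hours
        if (n >= 4) s += $(n-3) * 86400     # days
        print s
    }'
}

# Age in seconds after which a xymonnet run looks suspicious.
# 300 is an assumed value; pick one longer than a normal test cycle.
MAX_AGE=${MAX_AGE:-300}

# List every xymonnet process with its elapsed run time and flag old ones.
ps -C xymonnet -o pid= -o etime= | while read -r pid etime; do
    age=$(etime_to_seconds "$etime")
    if [ "$age" -gt "$MAX_AGE" ]; then
        echo "xymonnet pid $pid has been running for ${age}s (possibly hung)"
    fi
done
```

If nothing is printed, either no xymonnet process is running at that
moment or none has exceeded the threshold.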

Thanks,
Larry Barber

On Tue, Nov 6, 2012 at 8:02 AM, Don Kuhlman <user-5eb2bfadc6c6@xymon.invalid> wrote:
 Thanks Larry. Looks like everything went purple again at 6:45 this
morning.  The logs still show 0 bytes.
Any other suggestions for trying to figure this out?

 Regards,

 Don

  From: Larry Barber <user-6ef9c2864140@xymon.invalid>
Date: Mon, 5 Nov 2012 17:19:53 -0600
To: Don Kuhlman <user-5eb2bfadc6c6@xymon.invalid>

Subject: Re: [Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in
Xymon 4.3.10

 Xymonnet tends to be pretty quiet unless something goes wrong. You won't
be able to tell for sure until you get one of your purple storms.

 Alerts are handled by a different module. Look in tasks.cfg to find it.
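
 One way to look through tasks.cfg is to print each task name next to the
command it runs. This is only a sketch: list_tasks is a hypothetical
helper, and it assumes the usual layout of a [name] section header
followed by indented keyword lines such as CMD.

```shell
#!/bin/sh
# Print "task-name<TAB>command" for every section of a tasks.cfg-style file.
# list_tasks is a hypothetical helper, not shipped with Xymon.
list_tasks() {
    awk '
        # Section header such as [xymonnet]: remember the name.
        /^\[.*\]/ { name = substr($0, 2, index($0, "]") - 2); next }
        # CMD line: strip the keyword and print it with the task name.
        $1 == "CMD" { sub(/^[ \t]*CMD[ \t]+/, ""); print name "\t" $0 }
    ' "$1"
}

# Example (path taken from the command quoted later in this thread):
#   list_tasks /usr/lib64/xymon/server/etc/tasks.cfg | grep -i alert
```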

 Thanks,
Larry Barber

On Mon, Nov 5, 2012 at 3:53 PM, Don Kuhlman <user-5eb2bfadc6c6@xymon.invalid> wrote:
 Hi Larry/all.  I've noticed that the xymonnet.log and
xymonnet-again.log files are staying at 0 bytes.  Does that indicate
a problem?
(Xymon hasn't gone purple all day, but it's still not sending email
alerts to anyone.)

-rw-rw-rw- 1 xymon xymon        0 Nov  5 15:05 /var/log/xymon/xymonnet-again.log
-rw-rw-rw- 1 xymon xymon        0 Nov  5 15:07 /var/log/xymon/xymonnet.log

 Thanks

 Don K


  From: Larry Barber <user-6ef9c2864140@xymon.invalid>
Date: Mon, 5 Nov 2012 11:19:32 -0600
To: Don Kuhlman <user-5eb2bfadc6c6@xymon.invalid>
 Cc: Xymon Email List <xymon at xymon.com>
Subject: Re: [Xymon] FW: Troubleshooting Purple CONN and HTTP Tests in
Xymon 4.3.10

 All the server-side Xymon logs are in /var/log/xymon by default. Since
you say that you are getting purple storms for conn and http tests, the
problem is likely with your xymonnet process. Check the xymonnet log, and
when you see the purples, check whether there is a xymonnet instance
running. If that instance has been running for more than a few minutes,
kill it. If the xymonnet process is hanging, you might want to set the
MAXTIME parameter on the xymonnet task in tasks.cfg. That doesn't
really fix the problem, but it will at least stop things from going
purple.
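
 As a rough illustration of the MAXTIME suggestion, the xymonnet section
of tasks.cfg might gain one line like the fragment below. The 10m value
is only an assumed example, and the fragment assumes MAXTIME accepts the
same m/h/d time suffixes as INTERVAL. Check the tasks.cfg man page for
your version before relying on it.

```
# Existing ENVFILE, NEEDS, CMD, LOGFILE and INTERVAL lines stay unchanged.
[xymonnet]
	MAXTIME 10m
```

 The limit should be longer than a normal xymonnet run, so that only a
genuinely stuck process gets terminated.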

 Thanks,
Larry Barber

On Mon, Nov 5, 2012 at 10:01 AM, Don Kuhlman <user-5eb2bfadc6c6@xymon.invalid> wrote:
 Update to this. While googling further, I saw a thread titled
"[hobbit] stale alerts".  This mentioned that there could be an external
script that I created which may cause issues for xymon when it runs.  I do
have a diskstat.sh script that may be causing problems. For now, I'm
setting it to DISABLED in the tasks.cfg file.

 Is there a way to see log information in xymon to try and verify
something like this?

 Thanks

 Don K

  From: Don Kuhlman <user-5eb2bfadc6c6@xymon.invalid>
Date: Mon, 5 Nov 2012 08:34:29 -0600
To: Xymon Email List <xymon at xymon.com>
Subject: Troubleshooting Purple CONN and HTTP Tests in Xymon 4.3.10

  Hi folks.  We've been running xymon for about 10 months now. It's
been fine all this time.

 However, last week around Wednesday we started getting purple storms on
the CONN and HTTP tests for all our hosts.
If I stop Xymon and restart it, or reboot the server (Linux 5.x), it
comes back OK.
This also happened Thursday, and then again Saturday around 2 PM CST.

 Anyone have a link or source for which logs to look in on the server
or xymon to see what may be causing the CONN and HTTP tests to randomly
start failing like this or where to start troubleshooting?

 Can I use xymonlaunch --debug like this to see what is happening?
         /usr/lib64/xymon/server/bin/xymonlaunch --debug \
             --config=/usr/lib64/xymon/server/etc/tasks.cfg \
             --env=/usr/lib64/etc/xymonserver.cfg


 While searching the xymon forum and message boards, I saw some things
that say it may be disk space or inodes, but it seems like we are ok there -
 df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda2            3899392  204731 3694661    6% /
tmpfs                 490139       6  490133    1% /dev/shm
/dev/sda1              32768      51   32717    1% /boot

 df
 Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda2             61312028   5748784  52448700  10% /
tmpfs                  1960556       188   1960368   1% /dev/shm
/dev/sda1               516040     87716    402112  18% /boot

 DNS also seems fine.

 Thanks

 Don K