Xymon Mailing List Archive

unexpected green mails

Robin Wood
Mon, 7 Mar 2005 22:35:04 +0000
Message-Id: <user-a3a29df34f11@xymon.invalid>

RC5 is unfortunately still causing random "x stopped reporting" errors.
I just got 20 mails similar to this one:

Subject: 	BB [431703] mydomain.int:imap stopped reporting to BB
Date: 	Mon,  7 Mar 2005 22:02:52 +0000 (GMT)

green  Mon Mar  7 21:32:41 2005 imap ok 

Service imap on mydomain.int is OK (up)


* OK [CAPABILITY IMAP4rev1 UIDPLUS CHILDREN NAMESPACE
THREAD=ORDEREDSUBJECT THREAD=REFERENCES SORT QUOTA IDLE ACL ACL2=UNION
STARTTLS] Courier-IMAP ready. Copyright 1998-2004 Double Precision,
Inc.  See COPYING for distribution information.
* BYE Courier-IMAP server shutting down
ABC123 OK LOGOUT completed


Seconds: 0.01


This is for the IMAP server on the same box as the monitor, so there
can be no network or connection issues. Does anyone have any ideas of
anything else to try?

On the plus side, it is happening less frequently.

Robin


On Sat, 5 Mar 2005 00:00:30 +0000, Robin Wood <user-a977a67e95c8@xymon.invalid> wrote:
One other thing I did think of is that I set my monitor period to be
30 mins. Could that have anything to do with it, for example the time
to live and the refresh period being the same?
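
To make the question concrete, here is a rough sketch of the arithmetic. It assumes the 30-minute default lifetime and 5-minute refresh that Henrik describes further down the thread. The function name `refresh_slack` is hypothetical and is not part of Hobbit or BB.

```python
def refresh_slack(lifetime_min, interval_min):
    """Minutes by which a single refresh may arrive late before the
    previous status outlives its lifetime (simplified model)."""
    return lifetime_min - interval_min


# Default setup described in the thread: 30-minute lifetime, 5-minute refresh.
print(refresh_slack(30, 5))   # 25

# Monitor period set equal to the lifetime: no margin left at all.
print(refresh_slack(30, 30))  # 0
```

With zero slack, a refresh that arrives even a few seconds late would leave the old status expired for a moment. Whether that is what triggers the mails here is only a guess.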


On Fri, 4 Mar 2005 23:59:22 +0000, Robin Wood <user-a977a67e95c8@xymon.invalid> wrote:
On Thu, 3 Mar 2005 23:15:41 +0100, Henrik Stoerner <user-ce4a2c883f75@xymon.invalid> wrote:
> On Thu, Mar 03, 2005 at 08:19:38PM +0000, Robin Wood wrote:
>> I've just put rc4 on so I'll see if anything does get fixed
>
> Do pick up the post-RC4 patch, it has the final fix for the green
> mails. http://www.hswn.dk/beta/post-RC4.patch
>
>> two questions though, first why would things stop reporting?
>
> Most common cause: The server was rebooted, and the client was not
> set up to restart automatically after a boot.

None of the boxes get rebooted, they are all live servers running
24/7, two with ISPs and one my own which I know the uptime of.

>> I'm monitoring 3 different boxes, one local, 2 remote on different
>> hosts, what constitutes "stopping reporting"?
>
> A status in Hobbit (and BB) has a lifetime - default is 30 minutes.
> Normally a status is refreshed every 5 minutes, so it stays "alive".
> If Hobbit sees that a status has not been updated for so long that its
> lifetime has been exceeded, it goes into the "stopped reporting"
> (purple) state.

I've never seen anything actually go purple when the mails were sent
out but I don't watch it all the time so it could have done.

>> The other is why are some of my green entries smilies and others
>> diamonds?
>
> Smilies mean the color has changed within the past 24 hours.

ok sounds reasonable that if it sends out the mails then it is because
it thinks the status has changed.
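
The lifetime rule described above can be sketched as a simple age check. This is only an illustration under stated assumptions, not Hobbit's actual code: the function `status_color`, the constant `DEFAULT_LIFETIME`, and the strict "older than" comparison are all hypothetical. The two timestamps are taken from the alert mail quoted at the top of the thread.

```python
from datetime import datetime, timedelta

# Assumed default lifetime, per the explanation in the thread.
DEFAULT_LIFETIME = timedelta(minutes=30)


def status_color(reported_color, last_update, now, lifetime=DEFAULT_LIFETIME):
    """Return 'purple' once the status is older than its lifetime,
    otherwise the color that was last reported."""
    if now - last_update > lifetime:
        return "purple"
    return reported_color


# Status timestamp and mail Date header from the quoted alert mail.
last_update = datetime(2005, 3, 7, 21, 32, 41)
alert_sent = datetime(2005, 3, 7, 22, 2, 52)

age = alert_sent - last_update
print(age)                                             # 0:30:11
print(status_color("green", last_update, alert_sent))  # purple
```

In that mail the green status was 30 minutes and 11 seconds old when the alert went out, which is just past a 30-minute lifetime. That fits a status expiring shortly before its next refresh, though it does not prove that this is the cause.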

I was going to report that RC4 had fixed it as I'd had no mails but
then I got this:

 - Program crashed

Fatal signal caught!

on the hobbit-alert monitor so I guess that may be why I hadn't got any.

I'll put the other patch on and see what happens.

Robin
Henrik