Alert Rules - DURATION not working

Henrik Størner
Wed, 2 Feb 2005 22:10:11 +0100
Message-Id: <user-198f36f13b64@xymon.invalid>

On Wed, Feb 02, 2005 at 08:56:22AM -0500, Tom Georgoulias wrote:
HOST=$FOUND_SYS
        MAIL user-20904209b1a6@xymon.invalid SERVICE=procs COLOR=red DURATION>5 REPEAT=5

After I add this rule, I restart hobbit.  I read on the list that 
restarting isn't necessary, but it has been my experience that changes 
made to hobbit-alerts.cfg do not always get put into effect unless 
hobbit is restarted.

It shouldn't be needed, but it does no harm.

2005-02-02 08:11:12 criteriamatch foundry01.nandomedia.com:procs (NULL):(NULL):procs
2005-02-02 08:11:12 failed minduration 0<300
OK
2005-02-02 08:16:12 Got page message from foundry01.nandomedia.com:procs
2005-02-02 08:16:12 0 alerts to go
And this looks suspicious.

What's supposed to happen is that after the alert is first reported to
the hobbitd_alert module, the module keeps track of when the next alert
is due (the REPEAT interval comes into play here),
and if no alerts are due then you get the "0 alerts to go" message.

So something messes up the timekeeping, and we never get around to
testing if the DURATION triggers after the first attempt.
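
As a rough illustration of the intended behaviour described above, here is a
minimal sketch in Python. It is not the hobbitd_alert code: the names Rule,
ActiveAlert and handle_page are hypothetical, the "elapsed < minimum" test
simply mirrors the form of the "failed minduration 0<300" log line, and
treating REPEAT=5 as 300 seconds is an assumption made to match how
DURATION>5 shows up as 300 in the log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    min_duration: int  # seconds; DURATION>5 appears as 300 in the log
    repeat: int        # seconds; REPEAT=5 assumed to mean 5 minutes


@dataclass
class ActiveAlert:
    event_start: int                       # when the status went red
    next_alert_time: Optional[int] = None  # None until an alert has been sent


def handle_page(alert: ActiveAlert, rule: Rule, now: int) -> bool:
    """Return True if an alert should go out for this page message."""
    elapsed = now - alert.event_start

    # The DURATION test is re-evaluated on every page message, not only
    # on the first one, so a rule that fails early can still fire later.
    if elapsed < rule.min_duration:
        return False

    # Once an alert has been sent, REPEAT decides when the next one is due.
    if alert.next_alert_time is not None and now < alert.next_alert_time:
        return False

    alert.next_alert_time = now + rule.repeat
    return True
```

With these assumptions, a page message at 0 seconds fails the duration test,
the one arriving 300 seconds later triggers the first alert, and further
alerts are held back until another 300 seconds have passed.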

[after looking over the code for 10 minutes]

I think I've got it, but there have been quite a few changes to various
bits, so I don't want to send one-line fixes now. I'll come up with a
proper full package, which will also include fixes for many of the
other bugs that have been reported for beta6.


Henrik