Xymon Mailing List Archive

DURATION explanation in hobbit-alerts.cfg

2 messages in this thread

Taylor Lewick · Fri, 11 Apr 2008 13:16:57 -0500
Okay, I've read the hobbit-alerts man page and docs.

 
Under DURATION the man page says: "Rule matching an alert if the event
has lasted longer/shorter than the given duration." It then goes on
to give examples, and those are in minutes.

But for the ping test for a host being down, I only want to send the
alert if the host has been unresponsive for at least 1 minute (I'm trying
to eliminate a bunch of false positives). So I set it up for a single
host with the following rule:

 
HOST=test103

MAIL xyz DURATION>1 REPEAT=10 COLOR=red RECOVERED

 
Based on the alerts I've received (see below), DURATION doesn't seem to
be measured in minutes. Is there a way to set up the alerts so that
nothing is sent unless the host has been down for at least 1 minute?

 
Service conn on test103 is not OK : Host does not respond to ping

System unreachable for 2 poll periods (19 seconds)

&red test103 is unreachable
Greg L Hubbard · Fri, 11 Apr 2008 13:40:55 -0500
Actually, I think the alert model is based on the "state" that a
"service" is in.  So what you might alert on is that the "conn" service
has been "red" for at least one minute.  This may or may not be very
effective, given the built-in functions in the Hobbit ping testing.  If
you want to know that something has failed two successive poll cycles,
and the poll cycle is 5 minutes, you could set DURATION>6.
 
GLH
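
A minimal sketch of what the suggestion above might look like in hobbit-alerts.cfg, reusing the host and recipient from the original question. The SERVICE=conn qualifier and the value 6 are assumptions (they presume a 5-minute poll cycle and that only the ping test should be covered), and a bare DURATION number is taken to mean minutes, as in the man page examples mentioned in the question. Check both against your own installation.

```
# Sketch only, not a tested configuration.
# Assumes a 5-minute poll cycle: alert once "conn" on test103 has been
# red for more than 6 minutes, i.e. across two successive polls.
HOST=test103 SERVICE=conn
	MAIL xyz DURATION>6 REPEAT=10 COLOR=red RECOVERED
```

The rest of the recipient line (REPEAT=10, COLOR=red, RECOVERED) is unchanged from the rule in the question; only the DURATION threshold and the service restriction differ.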