Xymon Mailing List Archive

recovered messages/SMS missing

Dominique Frise
Tue, 30 May 2006 16:32:30 +0200
Message-Id: <user-64021249ba97@xymon.invalid>

Henrik Stoerner wrote:
On Mon, May 22, 2006 at 11:16:00AM +0200, Dominique Frise wrote:
We use the RECOVERED keyword for all recipients defined in hobbit-alerts.cfg.

We noticed a problem with hosts for which alerting on a given service is excluded during a certain time window. When a problem occurs on the service outside the exclusion window, the yellow/red alerts are sent. When the problem is resolved, however, no recovery message/SMS is sent. This does not depend on how long the service was down.


Example configuration and logs:

----hobbit-alerts.cfg----
...
...
# Do not send anything for given service(s) during period of time
HOST=test3 SERVICE=http TIME=*:0305:0315
...
...
# Rules by administrator
HOST=test3
MAIL user-5a72e5dcda3f@xymon.invalid REPEAT=24h RECOVERED
SCRIPT /usr/local/sendsms 0123456789 COLOR=red FORMAT=SMS REPEAT=24h RECOVERED

If I understand your configuration snippet correctly, then this is a configuration error. You shouldn't have rules with no recipients, like the first one you have shown here.

Is this a bug, or is something wrong with the exclusion specification?

Your exclusion is wrong. It should be (notice the TIME setting):

HOST=test3 TIME=*:0315:0305
   MAIL user-5a72e5dcda3f@xymon.invalid REPEAT=24h RECOVERED
   SCRIPT /usr/local/sendsms 0123456789 COLOR=red FORMAT=SMS REPEAT=24h RECOVERED
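
To illustrate why the reversed window 0315:0305 covers the whole day except the ten minutes from 03:05 to 03:15, here is a minimal Python sketch. It assumes that a window whose start is later than its end wraps past midnight and that the end time is exclusive. The function name in_window and the constants are hypothetical, and this is not Xymon's actual implementation.

```python
from datetime import time


def in_window(now: time, start: time, end: time) -> bool:
    """Return True if `now` lies in [start, end).

    Assumption: when start > end, the window wraps past midnight.
    """
    if start <= end:
        # Ordinary window within a single day.
        return start <= now < end
    # Wrapping window: from start until midnight, or from midnight until end.
    return now >= start or now < end


# Window from the corrected rule: alerts allowed from 03:15 until 03:05 the next day.
ALERT_START = time(3, 15)
ALERT_END = time(3, 5)

for sample in (time(3, 0), time(3, 10), time(3, 20), time(12, 0)):
    print(sample.strftime("%H:%M"), in_window(sample, ALERT_START, ALERT_END))
```

Under these assumptions, only 03:10 falls outside the window, so the rule with recipients matches at all other times.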


Regards,
Henrik

Thank you for these explanations.

Does that mean it is not possible to write a simple rule that excludes alerts for a given service on all hosts (HOST=*) during a period of time?
Do we really have to write the same exclude/include rules for each host?


Dominique
UNIL - University of Lausanne