Xymon Mailing List Archive

Alerts color filtering

2 messages in this thread

Daniel Hartmeier · Tue, 24 Jul 2007 16:34:38 +0200
How do you configure hobbit-alerts.cfg so a specific recipient only gets
notified on changes from/to color red?

For example, I added

  HOST=*
    SCRIPT /path/alert.sh recipient COLOR=red,yellow REPEAT=4h RECOVERED

where alert.sh appends a line to alerts.log for each invocation:

  #!/bin/sh
  echo "$BBHOSTSVCCOMMAS $BBCOLORLEVEL ACKCODE=$ACKCODE DOWNSECS=$DOWNSECS RECOVERED=$RECOVERED RCPT=$RCPT" >>/path/alerts.log

I manually simulate a sequence green -> yellow -> red -> yellow ->
green on host.service with

  bb 127.0.0.1 "status host.service yellow `date`"
  bb 127.0.0.1 "status host.service red `date`"
  bb 127.0.0.1 "status host.service yellow `date`"
  bb 127.0.0.1 "status host.service green `date`"

and in the log I see

  host.service yellow ACKCODE=282431 DOWNSECS=0 RECOVERED=0 RCPT=recipient
  host.service red ACKCODE=282431 DOWNSECS=13 RECOVERED=0 RCPT=recipient
  host.service yellow ACKCODE=-1 DOWNSECS=42 RECOVERED=1 RCPT=recipient

Note that there is no invocation for the red -> yellow change, and that
the yellow -> green change reports 'yellow' as the current color.

Now, if I add

  HOST=*
    MAIL mail.addr REPEAT=4h COLOR=red RECOVERED

and simulate the same sequence, I only get a mail about

  Subject: Hobbit [282431] host:service CRITICAL (RED)

and there is NO mail about the service recovering.

I assume that's because the color changed from red -> yellow -> green,
and that

  red -> yellow : does not match the RECOVERED criterion
  yellow -> green : does not match the COLOR=red criterion
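
The reasoning above can be replayed with a small shell model. This is only
a sketch of the hypothesis, not Xymon's actual matching code:
color_matches is a hypothetical helper, and the event list simply mirrors
what alert.sh logged earlier in this message (no event for red -> yellow,
and a recovery event that still carries 'yellow').

```shell
#!/bin/sh
# Hypothetical model of the COLOR= filter; not Xymon's actual code.
# color_matches RULE_COLORS EVENT_COLOR
#   exit status 0 if EVENT_COLOR appears in the comma-separated RULE_COLORS.
color_matches() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Events as "color:recovered", mirroring the alerts.log lines for the
# sequence green -> yellow -> red -> yellow -> green.
for event in yellow:0 red:0 yellow:1; do
    color=${event%%:*}
    recovered=${event##*:}
    if color_matches "red" "$color"; then
        echo "COLOR=red rule: send (color=$color recovered=$recovered)"
    else
        echo "COLOR=red rule: skip (color=$color recovered=$recovered)"
    fi
done
```

Under these assumptions only the red event passes the COLOR=red filter;
the recovery event is skipped because it is evaluated with color yellow.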

Is there any way to get a recovery mail in this scenario, WITHOUT
changing the rule to COLOR=red,yellow?
I don't want to spam the recipient with incidents which never reach
color red (i.e. green -> yellow -> green).
For this recipient, the incident is actually considered 'recovered' as
soon as the color is no longer red...
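
One possible workaround, sketched here under several assumptions, is to
keep a SCRIPT rule with COLOR=red,yellow but let the script itself drop
everything that never reached red. This does change the rule, but the
recipient would not see yellow-only incidents. STATE_DIR and decide() are
hypothetical names; the environment variables are the ones alert.sh above
already uses, and the sketch assumes the script is invoked as shown in the
alerts.log output (once on red, once on recovery).

```shell
#!/bin/sh
# Sketch of a filtering wrapper; STATE_DIR and decide() are hypothetical.
STATE_DIR=${STATE_DIR:-/tmp/alert-state}
mkdir -p "$STATE_DIR"

# decide HOSTSVC COLOR RECOVERED -> prints "forward" or "drop"
decide() {
    flag="$STATE_DIR/$1.red"
    if [ "$3" = 1 ]; then
        # Recovery: only interesting if this incident reached red.
        if [ -f "$flag" ]; then
            rm -f "$flag"
            echo forward
        else
            echo drop
        fi
    elif [ "$2" = red ]; then
        : > "$flag"          # remember that red was reached
        echo forward
    else
        echo drop            # yellow-only events are never forwarded
    fi
}

# When run from a SCRIPT rule these variables come from the environment.
if [ -n "${BBHOSTSVCCOMMAS:-}" ]; then
    decide "$BBHOSTSVCCOMMAS" "${BBCOLORLEVEL:-}" "${RECOVERED:-0}"
fi
```

The actual notification (mail, log line, and so on) would be sent only
when decide prints "forward"; that part is left out on purpose.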

Or how do you explain to your recipient (who only wishes to see
'serious' incidents), that they can't rely on getting a 'recovered' mail
in all cases?

Daniel
Daniel Hartmeier · Thu, 26 Jul 2007 22:54:24 +0200
To resolve the issue I'd like to suggest the following patch:

--- hobbitd/hobbitd_alert.c.orig	Thu Jul 26 20:38:33 2007
+++ hobbitd/hobbitd_alert.c	Thu Jul 26 20:39:12 2007
@@ -636,9 +636,10 @@
 				/* 
 				 * Send one "recovered" message out now, then go to A_DEAD.
 				 * Dont update the color here - we want recoveries to go out 
-				 * only if the alert color triggered an alert
+				 * only if the alert maxcolor triggered an alert
 				 */
 				awalk->state = A_RECOVERED;
+				awalk->color = awalk->maxcolor;
 			}
 
 			if (oldalertstatus != newalertstatus) {

This makes sure that when filtering recovery messages, we don't compare
the current (most recent) color of the alert, but the worst color the
alert has gone through.
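
To illustrate the difference, here is a simplified shell model of which
color gets compared against COLOR= when the recovery is filtered. It is
not the hobbitd_alert.c logic itself: color_rank, worst_color and
recovery_color are hypothetical helpers, and only the three colors used
in this thread are ranked.

```shell
#!/bin/sh
# Simplified model of recovery filtering; not the hobbitd_alert.c code.
# Assumption: only green < yellow < red are ranked.
color_rank() {
    case "$1" in
        yellow) echo 1 ;;
        red)    echo 2 ;;
        *)      echo 0 ;;   # green and anything unknown
    esac
}

# worst_color COLOR... -> prints the most severe color in the sequence
worst_color() {
    worst=green
    for c in "$@"; do
        if [ "$(color_rank "$c")" -gt "$(color_rank "$worst")" ]; then
            worst=$c
        fi
    done
    echo "$worst"
}

# recovery_color MODE COLOR... -> color compared against COLOR= on recovery
#   MODE=last : last non-green color (behaviour observed without the patch)
#   MODE=max  : worst color seen (behaviour intended by the patch)
recovery_color() {
    mode=$1
    shift
    if [ "$mode" = max ]; then
        worst_color "$@"
        return
    fi
    last=green
    for c in "$@"; do
        if [ "$c" != green ]; then
            last=$c
        fi
    done
    echo "$last"
}

echo "unpatched: $(recovery_color last green red yellow green)"
echo "patched:   $(recovery_color max green red yellow green)"
```

In this model the unpatched case yields yellow, which a COLOR=red filter
rejects, while the patched case yields red, which it accepts.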

So when you have a filter like

  MAIL user-4565121bb113@xymon.invalid COLOR=red RECOVERED

and the service goes through the sequence

  green -> red -> yellow -> green

you not only get a critical mail on red, but also a recovery mail on
green.

Or is the existing behaviour (without the patch) really intentional, and
some people actually prefer to get only the critical mail without any
recovery mail in this and similar cases? If so, why?

Daniel