Xymon Mailing List Archive

Sanity checking my alert config

4 messages in this thread

Shawn Heisey · Wed, 28 Jul 2021 18:20:54 -0600
In alerts.cfg, I have this:

HOST=* COLOR=red,purple SERVICE=msgs
         MAIL $ADMINMAIL
HOST=* COLOR=red,purple
         IGNORE SERVICE=msgs
         MAIL $ADMINMAIL RECOVERED

My goal with this is to see alerts for purple and red statuses for all tests, and then get recovery alerts for all tests except msgs.  Did I do the config right for that?

I would also like to see if I can get a legend for the alerting grid in the "info" page.  Here's a screenshot showing what the above config produces:

https://www.dropbox.com/s/6wvutxctcn4uztf/xymon-alerts-elyograg.png?dl=0

I'd like to know what "(R)" and "(S)" mean, and what the line where the email address is blue means.

Thanks,
Shawn
Jeremy Laidman · Thu, 29 Jul 2021 11:56:21 +1000
Shawn

I can't comment on your alerts config, as I rarely use them, so I'm not
familiar with the syntax.

The (R) and (S) are alert codes. R means "recovered" and S means "stopped"
(matched a STOP rule).

I couldn't find the details in any documentation, but the source code
reveals all! There are other codes, and they would be listed in a code
string (not necessarily just a single character) separated by commas,
within the parentheses. Here's what the code says:

A = no alerts
R = recovered (alert has recovered from an alert state)
N = notice (matches NOTICE rule because message is a "notify" message)
S = stoprule (matches a rule with a STOP recipient)
U = unmatched
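
As a small illustration, the codes above could be decoded with a shell helper like the one below. This is only a sketch: decode_alert_codes is a hypothetical function, not part of Xymon, and the meanings are copied from the list above rather than verified independently.

```shell
#!/bin/sh
# Decode the code string shown in parentheses on the info page, e.g. "(R,S)".
# decode_alert_codes is a hypothetical helper, not part of Xymon; the letter
# meanings are taken from the list in this message.
decode_alert_codes() {
    # Strip optional surrounding parentheses, then split on commas.
    codes=$(printf '%s' "$1" | tr -d '()')
    old_ifs=$IFS
    IFS=,
    for code in $codes; do
        case "$code" in
            A) echo "A: no alerts" ;;
            R) echo "R: recovered" ;;
            N) echo "N: notice" ;;
            S) echo "S: stoprule" ;;
            U) echo "U: unmatched" ;;
            *) echo "$code: unknown code" ;;
        esac
    done
    IFS=$old_ifs
}

decode_alert_codes "(R,S)"
```

Any code not in the list is reported as unknown rather than guessed at.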

Hope that helps

Adam Thorn · Thu, 29 Jul 2021 11:14:46 +0100
On 29/07/2021 01:20, Shawn Heisey wrote:
> In alerts.cfg, I have this:
>
> HOST=* COLOR=red,purple SERVICE=msgs
>          MAIL $ADMINMAIL
> HOST=* COLOR=red,purple
>          IGNORE SERVICE=msgs
>          MAIL $ADMINMAIL RECOVERED
>
> My goal with this is to see alerts for purple and red statuses for all tests, and then get recovery alerts for all tests except msgs.  Did I do the config right for that?

The hostname for HOST= should be either a simple string, or a perl-compatible regex (prefixed with a % to indicate that). Thus to match all hostnames you could use

HOST=%.*

Your COLOR=red,purple rule appears to just be for the msgs service but you said you want alerts for all tests. The same comment applies; you can have a rule filter all services with

SERVICE=%.*

but note that a rule in alerts.cfg "consists of one or more filters", i.e. a rule for

COLOR=red,purple MAIL $ADMIN

is valid due to having a single COLOR filter, and will implicitly match all HOSTs and SERVICEs as none are specified.

Your second rule won't do what you want; IGNORE and MAIL are both "recipients" of the matched rule. The IGNORE recipient causes rule processing to stop: "when the IGNORE recipient is matched, no more recipients will be considered". I don't use RECOVERED but I think you want something like

RECOVERED EXSERVICE=msgs MAIL $ADMIN

See man alerts.cfg and in particular the sections describing "RULES" and then "RECIPIENTS".

You can test an alerts.cfg file by running:

/usr/lib/xymon/server/bin/xymond_alert  --config=/tmp/test-alerts.cfg --test hostname.example.com msgs --color=red --duration=60

which will show you which, if any, of your rules would match and fire an alert. I don't know if you can easily test the RECOVERED rule like that though.
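
To check several combinations in one go, the command above could be wrapped in a small loop. This is a rough sketch only: run_alert_test is a hypothetical helper, "cpu" is just an example second service, and the binary path, config path and hostname are the ones from the command above and may differ on your install. It uses only the options already shown there.

```shell
#!/bin/sh
# Run the xymond_alert test mode for several service/color combinations.
# XYMOND_ALERT, CONFIG and TEST_HOST are assumptions; override them to
# match your own installation.
XYMOND_ALERT=${XYMOND_ALERT:-/usr/lib/xymon/server/bin/xymond_alert}
CONFIG=${CONFIG:-/tmp/test-alerts.cfg}
TEST_HOST=${TEST_HOST:-hostname.example.com}

# Print a header for one combination and run the test if the binary exists.
run_alert_test() {
    svc=$1
    col=$2
    echo "== $svc / $col"
    if [ -x "$XYMOND_ALERT" ]; then
        "$XYMOND_ALERT" --config="$CONFIG" --test "$TEST_HOST" "$svc" \
            --color="$col" --duration=60
    else
        echo "skipped: $XYMOND_ALERT not found or not executable"
    fi
}

# "cpu" is only an example of a second service name.
for service in msgs cpu; do
    for color in red purple; do
        run_alert_test "$service" "$color"
    done
done
```

Comparing the msgs output with the output for another service should make it clear whether a rule is restricted to one service or applies to all of them.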

(Hmm. Actually, trying that with a rule for HOST=* seems to match just as well as HOST=%.*, even though the man page suggests to me it shouldn't.)

Adam
Shawn Heisey · Thu, 29 Jul 2021 09:38:27 -0600
On 7/29/2021 4:14 AM, Adam Thorn wrote:
> Your second rule won't do what you want; IGNORE and MAIL are both "recipients" of the matched rule. The IGNORE recipient causes rule processing to stop: "when the IGNORE recipient is matched, no more recipients will be considered". I don't use RECOVERED but I think you want something like
>
> RECOVERED EXSERVICE=msgs MAIL $ADMIN

I think I was able to work this out with the awesome info you provided:

COLOR=red,purple SERVICE=msgs
         MAIL $ADMINMAIL

COLOR=red,purple EXSERVICE=msgs
         MAIL $ADMINMAIL RECOVERED

Now on the info page, all the tests only have one line in the recipient column, and they all have (R) except msgs.  I think this is precisely what I am aiming for.  At the moment I do not have xymon monitoring any logfiles, but I will eventually think of something for it to alert on.

Thanks,
Shawn