Xymon Mailing List Archive

Alerting - I'm not doing it right...

Henrik Størner
Thu, 15 Dec 2011 12:36:12 +0100
Message-Id: <61ddc1356d84ddbbf98144a2e35fbdf4@localhost>

On Thu, 15 Dec 2011 10:02:43 +0000, Carl Inglis <user-96685bdc864b@xymon.invalid>
wrote:
> alerts.cfg
>
> $EMAIL_ALERT=user-96685bdc864b@xymon.invalid
> $LIN_WINDOWS_PROBLEMS=$EMAIL_ALERT
>
> HOST=%lin(.*) SERVICE=%win(.*)
>         MAIL $LIN_WINDOWS_PROBLEMS REPEAT=24h DURATION>1d RECOVERED STOP
>
> HOST=* EXPAGE=printers
>         MAIL $EMAIL_ALERT REPEAT=1h RECOVERED UNMATCHED STOP
>
> When the host "lin-apps-01" has a yellow alert on its "winUpdates"
> service, I expect it to shout about it once every 24h. It is, however,
> shouting about it once every hour.

There may be some confusion about "service" here. 

When you refer to "winUpdates" - is that a status-column in Xymon, or a
Windows Service that you are monitoring with a client on the Windows
machine? The latter would typically show up in a "svcs" (services) status
column on Xymon.

The SERVICE=... setting in alerts.cfg refers to the status-column, not a
Windows service. So to catch a "Windows updates" service that is not
running, you would have 'SERVICE=svcs' in alerts.cfg.
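
As a rough sketch, assuming the Windows service really is monitored by the
client and reported in the "svcs" column, the first rule might then look
like this. Only the SERVICE setting differs from your rule; the macro and
the other options are copied from your configuration, not a recommendation:

HOST=%lin(.*) SERVICE=svcs
        MAIL $LIN_WINDOWS_PROBLEMS REPEAT=24h DURATION>1d RECOVERED STOP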

What the first part of your alerts.cfg says is "if you have a host whose
name contains 'lin', and that host has a status-column that contains 'win',
then send an alert after 1 day, and repeat every 24 hours".

The second part of your configuration says "Any status that has an error -
except those on the 'printers' page, and those handled by other rules -
triggers an alert that is repeated once an hour". Pretty broad definition, I
think.
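
One possible way to narrow it (an untested sketch, and only an assumption
about what you want) is to add EXSERVICE to the second rule, reusing the
pattern from your first rule, so that the catch-all no longer covers those
columns. Note that this would exclude matching columns on all hosts, not
only the "lin" ones:

HOST=* EXPAGE=printers EXSERVICE=%win(.*)
        MAIL $EMAIL_ALERT REPEAT=1h RECOVERED UNMATCHED STOP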


Hope that removes a bit of confusion.


Regards,
Henrik