Xymon Mailing List Archive

Alerting - I'm not doing it right...

5 messages in this thread

list Carl Inglis · Thu, 15 Dec 2011 10:02:43 +0000 ·
Hi folks,

I'm sure this is going to be something really silly that I've missed - but I've been pulling my hair out over this one for a couple of days now.

alerts.cfg

$EMAIL_ALERT=user-96685bdc864b@xymon.invalid
$LIN_WINDOWS_PROBLEMS=$EMAIL_ALERT

HOST=%lin(.*) SERVICE=%win(.*)
        MAIL $LIN_WINDOWS_PROBLEMS REPEAT=24h DURATION>1d RECOVERED STOP

HOST=* EXPAGE=printers
        MAIL $EMAIL_ALERT REPEAT=1h RECOVERED UNMATCHED STOP

When the host "lin-apps-01" has a yellow alert on its "winUpdates" service, I expect it to shout about it once every 24h. It is, however, shouting about it once every hour.

It's clear that the first HOST line is being ignored - I suspect my regex is incorrect in some way.

Any thoughts or pointers would be appreciated.

Regards,

Carl

Carl Inglis
Systems Administrator

Rakon UK Limited
Dowsett House, Sadler Road, Lincoln LN6 3RS, United Kingdom
Tel: +XX (X)XXXX XXXXXX | Fax:+XX (X) XXXX XXXXXX | Mob: +44 (0) 7786 552915
user-96685bdc864b@xymon.invalid | www.rakon.com

list Johan Sjöberg · Thu, 15 Dec 2011 10:30:31 +0000 ·
Hi.

How does it look in the "Info" page for that server? Both alert lines would match that server, so you get an alert every hour plus an extra e-mail once per day after the first day, unless the server is on the printers page.
If you don't want the second alert line to match that server/test, you need to exclude it.
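
A minimal sketch of such an exclusion, assuming EXSERVICE accepts the same %regex form as SERVICE (check the alerts.cfg man page for your version). Note that it keeps every status column starting with "win" out of the catch-all rule on all hosts, not only the "lin" ones:

```
# Sketch only: the catch-all rule from the original post, with the
# "win..." status columns excluded so they are left to the earlier rule.
HOST=* EXPAGE=printers EXSERVICE=%^win
        MAIL $EMAIL_ALERT REPEAT=1h RECOVERED UNMATCHED STOP
```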

/Johan
list Carl Inglis · Thu, 15 Dec 2011 11:18:08 +0000 ·
Hi,

You may have a point on the delay criteria. I'll try taking that out.

Thanks!

Carl


From: Johan Sjöberg [mailto:user-74c177c1220d@xymon.invalid]
Sent: 15 December 2011 11:15
To: Carl Inglis
Subject: RE: Alerting - I'm not doing it right...

Hi.

Maybe it doesn't count as received if the delay of the first line hasn't been exceeded yet?
I have to admit I haven't even seen this UNMATCHED feature before, so I thought all matching lines would generate an alert.

/Johan

From: Carl Inglis [mailto:user-96685bdc864b@xymon.invalid]
Sent: 15 December 2011 12:12
To: Johan Sjöberg
Subject: RE: Alerting - I'm not doing it right...

Hi Johan,

The relevant section of the Info page looks like this:

winUpdates   user-96685bdc864b@xymon.invalid (R,S)     1d   -   1d   -   purple,yellow,red
             user-96685bdc864b@xymon.invalid (R,S,U)   -    -   1h   -   purple,yellow,red

It would seem that I have misunderstood the UNMATCHED entry then.

The man page says "The alert is sent to this recipient ONLY if no other recipients received an alert for this event."

I thought that it meant that for any given pass through alerts.cfg ("this event") the UNMATCHED recipient would only be emailed if the event hadn't triggered any other recipient.

Regards,

Carl
list Henrik Størner · Thu, 15 Dec 2011 12:36:12 +0100 ·
On Thu, 15 Dec 2011 10:02:43 +0000, Carl Inglis <user-96685bdc864b@xymon.invalid> wrote:
> alerts.cfg
>
> $EMAIL_ALERT=user-96685bdc864b@xymon.invalid
> $LIN_WINDOWS_PROBLEMS=$EMAIL_ALERT
>
> HOST=%lin(.*) SERVICE=%win(.*)
>         MAIL $LIN_WINDOWS_PROBLEMS REPEAT=24h DURATION>1d RECOVERED STOP
>
> HOST=* EXPAGE=printers
>         MAIL $EMAIL_ALERT REPEAT=1h RECOVERED UNMATCHED STOP
>
> When the host "lin-apps-01" has a yellow alert on its "winUpdates"
> service, I expect it to shout about it once every 24h. It is, however,
> shouting about it once every hour.

There may be some confusion about "service" here.

When you refer to "winUpdates" - is that a status-column in Xymon, or a
Windows Service that you are monitoring with a client on the Windows
machine? The latter would typically show up in a "svcs" (services) status
column on Xymon.

The SERVICE=... setting in alerts.cfg refers to the status-column, not a
Windows service. So to catch a "Windows updates" service that is not
running, you would have 'SERVICE=svcs' in alerts.cfg.

What the first part of your alerts.cfg says, is "if you have a host whose
name contains 'lin', and that host has a status-column that contains 'win',
then send an alert after 1 day, and repeat every 24 hours".
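
Since the patterns are unanchored, a sketch of a tighter variant follows. It assumes the % patterns are ordinary PCRE expressions, so "^" anchors the match to the start of the name; the trailing (.*) is then unnecessary. The host name in the comment is hypothetical:

```
# Sketch only: anchored versions of the patterns in the first rule.
# With "^", a host such as "berlin-db" (hypothetical) no longer matches
# HOST, and only status columns that start with "win" match SERVICE.
HOST=%^lin SERVICE=%^win
        MAIL $LIN_WINDOWS_PROBLEMS REPEAT=24h DURATION>1d RECOVERED STOP
```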

The second part of your configuration says "Any status that has an error -
except those on the 'printers' page, and those handled by other rules -
trigger an alert that is repeated once an hour". Pretty broad definition, I
think.


Hope that removes a bit of confusion.


Regards,
Henrik
list Carl Inglis · Thu, 15 Dec 2011 12:01:04 +0000 ·
-----Original Message-----
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of user-ce4a2c883f75@xymon.invalid
Sent: 15 December 2011 11:36
To: xymon at xymon.com

> There may be some confusion about "service" here.
>
> When you refer to "winUpdates" - is that a status-column in Xymon, or a
> Windows Service that you are monitoring with a client on the Windows
> machine? The latter would typically show up in a "svcs" (services)
> status column on Xymon.

It's a status column that's returned by a BBWIN ext script - it goes yellow if there are pending Windows Updates on that server.

> The SERVICE=... setting in alerts.cfg refers to the status-column, not a
> Windows service. So to catch a "Windows updates" service that is not
> running, you would have 'SERVICE=svcs' in alerts.cfg.
>
> What the first part of your alerts.cfg says, is "if you have a host
> whose name contains 'lin', and that host has a status-column that
> contains 'win', then send an alert after 1 day, and repeat every 24
> hours".

Which is what I wanted it to do.

> The second part of your configuration says "Any status that has an
> error - except those on the 'printers' page, and those handled by other
> rules - trigger an alert that is repeated once an hour". Pretty broad
> definition, I think.

Indeed - I'm currently in development mode trying to finalise how we're going to do our alerting; the last line of the configuration was intended as a "you missed one" alert for me. There are a number of lines above the first line in my original email.

> Hope that removes a bit of confusion.

It does indeed, thank you.

It appears that removing the "DURATION>1d" option has stopped the second rule from firing - which would make sense since (as Johan suggested) the first rule is unmatched until the alert has a duration of more than one day.

Is that interpretation correct?
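
For reference, the change described above amounts to the following sketch: the first rule from the original post with only DURATION>1d dropped. The presumable side effect is that its first alert is no longer held back for a day:

```
# Sketch only: first rule without the one-day delay, so it counts as
# matched straight away and the later UNMATCHED recipient stays quiet.
HOST=%lin(.*) SERVICE=%win(.*)
        MAIL $LIN_WINDOWS_PROBLEMS REPEAT=24h RECOVERED STOP
```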

Thanks,

Carl