
ALERTS time of day problem

6 messages in this thread

list Mike Rowell · Thu, 25 May 2006 09:06:53 +0100 ·
Morning All (UK Time)

 
I've got an issue with the TIME function in hobbit-alerts which is
causing us a few problems.

Hobbit has been running fine and we've started adding finer-grained
control on the alerts so that we don't get paged out of hours for stuff
that we don't need to deal with.  Below are the hobbit-alerts entries.
The problem is that last night at 03:30 we were called out for an
alert on the aq service which, according to how the alert is set up, should
only be paging us between 0900 and 1700 on weekdays.  This is on 4.2PRE; we
have newer snapshots running in test, but I don't want to upgrade this
platform unless I have to, as it is running completely stable except for
this TIME problem.

 
Any ideas?

 
Mike Rowell

 
ALERTS------

 
PAGE=test
    MAIL=systems FORMAT=PLAIN TIME=W:0900:1800 COLOR=red,yellow STOP

HOST=%^switches.*
    MAIL=systems COLOR=red,yellow REPEAT=1h FORMAT=PLAIN
    MAIL=pager SERVICE=cpu COLOR=RED FORMAT=SMS DURATION>15 REPEAT=1h TIME=*:0800:2200 STOP

HOST=*
    MAIL=systems SERVICE=mrtg COLOR=red,yellow FORMAT=PLAIN STOP
    MAIL=systems SERVICE=repli,prtdiag FORMAT=PLAIN REPEAT=1h COLOR=red,yellow STOP
    MAIL=systems COLOR=red,yellow REPEAT=1h FORMAT=PLAIN
    MAIL=pager SERVICE=cpu COLOR=RED FORMAT=SMS DURATION>15 REPEAT=1h STOP
    MAIL=pager SERVICE=aq COLOR=RED FORMAT=SMS DURATION>5 REPEAT=1h TIME=W:0900:1700 STOP
    MAIL=pager COLOR=RED FORMAT=SMS DURATION>5 REPEAT=1h

This email has been scanned for all viruses by the MessageLabs service. 
list Pnixon · Thu, 25 May 2006 10:21:50 -0400 ·
Mike,
Without really thinking about it: I know there's a way to test the alert
setup from the command line (hobbitd_alert, I think; I'm sure someone will
chime in soon).

It lets you set the date, time, host, service, and color for the test, all
of which is detailed in the man page.

Here's an example from an email last month:

'bin/bbcmd hobbitd_alert --test whq-sapcon-1 telnet DURATION=361'
 
--Pat
list Henrik Størner · Sun, 28 May 2006 17:17:04 +0200 ·
On Thu, May 25, 2006 at 09:06:53AM +0100, Mike Rowell wrote:
> Hobbit has been running fine and we've started adding finer-grained
> control on the alerts so that we don't get paged out of hours for stuff
> that we don't need to deal with.  Below are the hobbit-alerts entries.
> The problem is that last night at 03:30 we were called out for an
> alert on the aq service which, according to how the alert is set up, should
> only be paging us between 0900 and 1700 on weekdays.  This is on 4.2PRE; we
> have newer snapshots running in test, but I don't want to upgrade this
> platform unless I have to, as it is running completely stable except for
> this TIME problem.

Please add "--cfid" to the hobbitd_alert CMD line in hobbitlaunch.cfg.
Next time this happens, the subject of the alert will include a
"[cfid:NUMBER]" text which is the line-number of your hobbit-alerts.cfg
configuration that triggered this alert.

Also, it would be interesting to see your notifications.log entries for
these alerts.


Regards,
Henrik
list Mike Rowell · Mon, 29 May 2006 18:17:59 +0100 ·
Henrik,

Thanks for the reply. The alert is being picked up on the following line:

    MAIL=support SERVICE=aq COLOR=RED FORMAT=SMS DURATION>5 REPEAT=1h TIME=W:0900:1700 STOP

This is line 130 of the hobbit-alerts.cfg file.  The thing is that it
just alerted me at 18:10, which is outside the TIME setting for this
entry.  Any ideas as to what to try next?  Here are the notifications.log
entries for that alert:

Mon May 29 18:11:05 2006 server.aq (10.8.2.1) systems 1148922665 0
Mon May 29 18:11:05 2006 server.aq (10.8.2.1) support 1148922665 0
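
When there are more than a couple of these entries to go through, a small script can pull out the fields. The following is a rough sketch in Python, not part of Hobbit: the function name parse_notification and the regular expression are made up for illustration, and the layout (timestamp, host.service, IP in brackets, recipient) is assumed from the two lines above. The trailing fields are kept unparsed because the thread does not say what they mean.

```python
import re
from datetime import datetime

# Layout assumed from the two sample lines in this thread:
#   <ctime-style timestamp> <host>.<service> (<ip>) <recipient> <other fields>
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"(?P<target>\S+) \((?P<ip>[^)]*)\) (?P<recipient>\S+)(?P<rest>.*)$"
)


def parse_notification(line):
    """Split one notifications.log line into a dict (hypothetical helper)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised notifications.log line: {line!r}")
    # Split on the last dot so that dotted hostnames stay in one piece.
    host, _, service = match.group("target").rpartition(".")
    return {
        "when": datetime.strptime(match.group("stamp"), "%a %b %d %H:%M:%S %Y"),
        "host": host,
        "service": service,
        "ip": match.group("ip"),
        "recipient": match.group("recipient"),
        # Remaining fields are left as raw strings; their meaning is not
        # stated in this thread.
        "rest": match.group("rest").split(),
    }
```

Each returned dict carries the local time of the notification in "when", which makes it easy to list every page that went out outside office hours.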

Mike 
list Henrik Størner · Mon, 29 May 2006 21:54:53 +0200 ·
On Mon, May 29, 2006 at 06:17:59PM +0100, Mike Rowell wrote:
> Henrik,
>
> Thanks for the reply. The alert is being picked up on the following line:
>
>     MAIL=support SERVICE=aq COLOR=RED FORMAT=SMS DURATION>5 REPEAT=1h TIME=W:0900:1700 STOP
>
> This is line 130 of the hobbit-alerts.cfg file.  The thing is that it
> just alerted me at 18:10, which is outside the TIME setting for this
> entry.  Any ideas as to what to try next?  Here are the notifications.log
> entries for that alert:
>
> Mon May 29 18:11:05 2006 server.aq (10.8.2.1) systems 1148922665 0
> Mon May 29 18:11:05 2006 server.aq (10.8.2.1) support 1148922665 0

You mentioned you were running the 4.2pre version, which is a rather
early snapshot of the 4.2 release. I believe this has been fixed
in the current code. What happened is that the alert code in that
snapshot would ignore general criteria such as TIME when they were used
together with a MAIL setting.
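
For reference, the intended behaviour of a TIME criterion can be sketched in a few lines of Python. This is not Hobbit's code. The function name in_time_window is made up for illustration, and the sketch assumes the spec has the form DAYS:HHMM:HHMM, with W meaning Monday to Friday, * meaning every day, digits meaning days of the week with 0 as Sunday, and the end time treated as exclusive.

```python
from datetime import datetime


def in_time_window(spec, when):
    """Return True if `when` falls inside a DAYS:HHMM:HHMM spec.

    Illustrative only; the day codes and the exclusive end time are
    assumptions, not taken from the Hobbit source.
    """
    days, start, end = spec.split(":")
    # Python counts Monday=0..Sunday=6; the spec is assumed to use
    # 0=Sunday..6=Saturday, so shift by one.
    dow = (when.weekday() + 1) % 7
    if days == "*":
        day_ok = True
    elif days == "W":
        day_ok = 1 <= dow <= 5
    else:
        day_ok = str(dow) in days
    # Zero-padded HHMM strings compare correctly as plain text.
    hhmm = when.strftime("%H%M")
    return day_ok and start <= hhmm < end
```

Under these assumptions, both the 03:30 call-out and the Monday 18:11 notification fall outside W:0900:1700, so a TIME criterion that was actually applied should have suppressed both pages.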

So - I'd like you to try it out with the current snapshot (which will be
a beta-release in a day or two).


Regards,
Henrik
list Mike Rowell · Mon, 29 May 2006 22:09:21 +0100 ·
Henrik,

Thought it might be something like that. As a temporary fix I've created
three different versions of the hobbit-alerts.cfg file and set up cron jobs
to move them into place as and when they are required.
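
A workaround of this kind could also be driven by one small script run from cron instead of separate jobs. The sketch below is only an illustration in Python: the three variant file names, the time windows assigned to them, and the helper names pick_variant and install_variant are all hypothetical, since nothing in this thread says how the three versions differ.

```python
import os
import shutil
from datetime import datetime

# Hypothetical variant files; the real names and windows are not given
# anywhere in this thread.
VARIANTS = {
    "office": "hobbit-alerts.cfg.office",    # weekdays 09:00-17:00 (assumed)
    "evening": "hobbit-alerts.cfg.evening",  # weekdays 17:00-22:00 (assumed)
    "night": "hobbit-alerts.cfg.night",      # all other times (assumed)
}


def pick_variant(now):
    """Choose which config variant should be live at time `now`."""
    is_weekday = now.weekday() < 5  # Monday=0 .. Friday=4
    if is_weekday and 9 <= now.hour < 17:
        return "office"
    if is_weekday and 17 <= now.hour < 22:
        return "evening"
    return "night"


def install_variant(cfg_dir, now):
    """Copy the chosen variant over hobbit-alerts.cfg in `cfg_dir`."""
    source = os.path.join(cfg_dir, VARIANTS[pick_variant(now)])
    target = os.path.join(cfg_dir, "hobbit-alerts.cfg")
    tmp = target + ".tmp"
    # Copy to a temporary name first, then rename, so a reader never
    # sees a half-written config file.
    shutil.copyfile(source, tmp)
    os.replace(tmp, target)
    return target
```

Keeping the choice of variant in one function means the schedule lives in one place, which makes it easier to remove the workaround again once the fixed release is installed.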

If we've got a beta due I'll wait for that. I'll try it on my dev platform
first; if it's okay for a day I'll move it to test (which is actually
the redundant BB server), then to live, probably early in the week after
the beta.

Mike 
