Can't see my alert in the "info" column

8 messages in this thread

list Frédéric Mangeant · Tue, 15 Feb 2005 11:57:03 +0100 ·
Hi all

I'm playing with alerts and Hobbit 4.0-rc2, and I must say the ease of use
is fantastic !
The only problem is that I can't see my alerts in the "info" column.

My $BBHOME/etc/hobbit-alerts.cfg contains this :

HOST=foo TIME=W:0900:1800
        SCRIPT /tmp/alerte.sh SERVICE=* EXSERVICE=disk,mem,procs
        SCRIPT /tmp/alerte.sh SERVICE=disk DURATION>5m REPEAT=2h
        SCRIPT /tmp/alerte.sh SERVICE=mem COLOR=yellow REPEAT=24h
        SCRIPT /tmp/alerte.sh SERVICE=procs TIME=*:1145:1150 REPEAT=24h

Alerts work fine, but I have this in the "info" column :

"No e-mail/SMS alerting defined"

Any hint ?

Thanks in advance.

Regards,

-- 

Frédéric Mangeant
list Henrik Størner · Tue, 15 Feb 2005 12:19:05 +0100 ·
quoted from Frédéric Mangeant
On Tue, Feb 15, 2005 at 11:57:03AM +0100, Frédéric Mangeant wrote:
Hi all

I'm playing with alerts and Hobbit 4.0-rc2, and I must say the ease of use
is fantastic !
Thanks :-)
The only problem is that I can't see my alerts in the "info" column.
Known mis-feature. The "info" generator cannot handle the Hobbit alert
configuration right now.


Regards,
Henrik
list Frédéric Mangeant · Tue, 15 Feb 2005 14:36:26 +0100 ·
quoted from Henrik Størner
The only problem is that I can't see my alerts in the "info" column.
Known mis-feature. The "info" generator cannot handle the 
Hobbit alert configuration right now.
Thanks for your answer. I've changed the subject of this mail because I think
some alerts don't work as expected.

With this $BBHOME/etc/hobbit-alerts.cfg :
quoted from Frédéric Mangeant

HOST=foo TIME=W:0900:1800
        SCRIPT /tmp/alerte.sh SERVICE=* EXSERVICE=disk,mem,procs
        SCRIPT /tmp/alerte.sh SERVICE=disk DURATION>5m REPEAT=2h
        SCRIPT /tmp/alerte.sh SERVICE=mem COLOR=yellow REPEAT=24h

        SCRIPT /tmp/alerte.sh SERVICE=procs TIME=*:1145:1150,*:1205:1300 REPEAT=24h

I received these alerts :

15/02/2005      11:38:33 foo.mem = yellow (ACK :125406)
15/02/2005      11:39:33 foo.disk = red (ACK :419182)
15/02/2005      11:45:13 foo.procs = red (ACK :992240)
15/02/2005      12:03:46 foo.procs = red (ACK :143469)
15/02/2005      13:39:50 foo.disk = red (ACK :78043)
15/02/2005      14:03:50 foo.procs = red (ACK :408423)
15/02/2005      14:09:50 foo.cpu = yellow (ACK :844373)
15/02/2005      14:09:50 foo.cpu = yellow (ACK :844373)
15/02/2005      14:11:50 foo.cpu = yellow (ACK :589672)
15/02/2005      14:11:50 foo.cpu = yellow (ACK :589672)

I think there are 2 false errors :
- for each 'foo.cpu' alert I got paged twice, with the same ACK code.
- I shouldn't have been paged between 11h05 and 12h05, nor after 13h00, for
'foo.procs'
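
The second point can be checked mechanically against the alert times listed above. The following is only a sketch: it assumes the TIME windows are inclusive at both ends and that the outer W:0900:1800 window is already satisfied. The helpers `to_minutes` and `in_windows` are hypothetical names for this illustration and are not part of Hobbit.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Hobbit): test whether an alert time
# falls inside any of the HHMM:HHMM windows given on the 'procs' rule.

# Convert HHMM to minutes since midnight (leading zeros are stripped so
# the shell does not read "09" as an invalid octal number).
to_minutes() {
    hh=${1%??}
    mm=${1#??}
    echo $(( ${hh#0} * 60 + ${mm#0} ))
}

# Usage: in_windows HHMM START:END [START:END ...]
# Returns 0 if HHMM lies inside at least one window (bounds assumed inclusive).
in_windows() {
    now=$(to_minutes "$1")
    shift
    for w in "$@"; do
        start=$(to_minutes "${w%:*}")
        end=$(to_minutes "${w#*:}")
        if [ "$now" -ge "$start" ] && [ "$now" -le "$end" ]; then
            return 0
        fi
    done
    return 1
}

# The three 'foo.procs' alert times from the list above, truncated to HHMM.
for stamp in 1145 1203 1403; do
    if in_windows "$stamp" 1145:1150 1205:1300; then
        echo "$stamp inside"
    else
        echo "$stamp outside"
    fi
done
```

Under those assumptions only the 11:45 alert falls inside a configured window; the 12:03 and 14:03 alerts fall outside both.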

Any clue ?

Thanks...

-- 

Frédéric Mangeant
list Henrik Størner · Tue, 15 Feb 2005 13:55:01 +0000 (UTC) ·
In <user-648014b9477f@xymon.invalid> Frédéric Mangeant <user-b6ea1d850181@xymon.invalid> writes:
some alerts don't work as expected.
[snip config and summary of sent alerts]
quoted from Frédéric Mangeant
I think there are 2 false errors :
- for each 'foo.cpu' alert I got paged twice, with the same ACK code.
- I shouldn't have been paged between 11h05 and 12h05, nor after 13h00, for
'foo.procs'
Could you try running "bbcmd hobbitd_alert --test foo cpu" ?

Also, if you add the option "--cfid" to the hobbitd_alert commandline
in hobbitlaunch.cfg, it will include the linenumber of the
hobbit-alerts.cfg file with each alert. That should make it easier to
track down what rules trigger an alert.


Regards,
Henrik
list Henrik Størner · Tue, 15 Feb 2005 14:00:51 +0000 (UTC) ·
quoted from Henrik Størner
In <cusuvl$s7c$user-e356fad9864f@xymon.invalid> Henrik Storner <user-ce4a2c883f75@xymon.invalid> writes:
Also, if you add the option "--cfid" to the hobbitd_alert commandline
in hobbitlaunch.cfg, it will include the linenumber of the
hobbit-alerts.cfg file with each alert. That should make it easier to
track down what rules trigger an alert.
I just noticed this won't work for SCRIPT recipients, because it's put
in the message subject which scripts ignore. So drop that.


Henrik
list Frédéric Mangeant · Tue, 15 Feb 2005 15:04:56 +0100 ·
quoted from Henrik Størner
I think there are 2 false errors :
- for each 'foo.cpu' alert I got paged twice, with the same ACK code.
- I shouldn't have been paged between 11h05 and 12h05, nor 
after 13h00, 
for 'foo.procs'
Could you try running "bbcmd hobbitd_alert --test foo cpu" ?
Of course :

$ $BBHOME/bin/bbcmd hobbitd_alert --test foo cpu
2005-02-15 14:59:22 Using default environment file ../etc/hobbitserver.cfg
Matching host:service:page 'foo:cpu:' against rule line 115:Matched
    *** Match with 'HOST=foo TIME=W:0900:1800' ***
Matching host:service:page 'foo:cpu:' against rule line 116:Matched
    *** Match with 'SCRIPT /tmp/alerte.sh SERVICE=* EXSERVICE=disk,mem,procs' ***
Script alert with command '/tmp/alerte.sh' and recipient SERVICE=*
Matching host:service:page 'foo:cpu:' against rule line 117:Failed (min. duration)
Matching host:service:page 'foo:cpu:' against rule line 118:Failed (color)
Matching host:service:page 'foo:cpu:' against rule line 119:Failed (time criteria)

Here are lines 115 to 119 of my $BBHOME/etc/hobbit-alerts.cfg :

115 HOST=foo TIME=W:0900:1800
116         SCRIPT /tmp/alerte.sh SERVICE=* EXSERVICE=disk,mem,procs
117         SCRIPT /tmp/alerte.sh SERVICE=disk DURATION>5m REPEAT=2h
118         SCRIPT /tmp/alerte.sh SERVICE=mem COLOR=yellow REPEAT=24h
119         SCRIPT /tmp/alerte.sh SERVICE=procs TIME=*:1145:1150,*:1205:1300 REPEAT=24h
quoted from Henrik Størner
Also, if you add the option "--cfid" to the hobbitd_alert 
commandline in hobbitlaunch.cfg, it will include the 
linenumber of the hobbit-alerts.cfg file with each alert. 
That should make it easier to track down what rules trigger an alert.
Done.
quoted from Henrik Størner
I just noticed this won't work for SCRIPT recipients, because it's put in
the message subject which scripts ignore. So drop that.
Undone ;-)

Regards,

-- 

Frédéric Mangeant
list Henrik Størner · Wed, 16 Feb 2005 12:55:10 +0100 ·
quoted from Frédéric Mangeant
On Tue, Feb 15, 2005 at 02:36:26PM +0100, Frédéric Mangeant wrote:
I think there are 2 false errors :
- for each 'foo.cpu' alert I got paged twice, with the same ACK code.
- I shouldn't have been paged between 11h05 and 12h05, nor after 13h00, for
'foo.procs'
I've tried, but I cannot make this happen on my own setup.

Could you send me the script you use for alerting, and the
~hobbit/data/ack/notifications.log file ?


Regards,
Henrik
list Frédéric Mangeant · Wed, 16 Feb 2005 15:28:11 +0100 ·
Hi Henrik
quoted from Henrik Størner
I've tried, but I cannot make this happen on my own setup.

Could you send me the script you use for alerting, and the 
~hobbit/data/ack/notifications.log file ?
Well, I moved to another server, on which I cleanly installed Hobbit 4.0-rc2
+ patches, and can't seem to reproduce the problem.

Anyway, here's my tiny paging script :

$ cat /tmp/alert.sh
#!/bin/sh

DATE=`date +%d/%m/%Y%t%H:%M:%S`
echo "$DATE $BBHOSTNAME.$BBSVCNAME = $BBCOLORLEVEL (ack : $ACKCODE, recovered : $RECOVERED)" >> /tmp/alert.txt
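
For comparison, here is a slightly more defensive variant of the same logger. It is a sketch, not a recommended replacement. The variables `BBHOSTNAME`, `BBSVCNAME`, `BBCOLORLEVEL`, `ACKCODE` and `RECOVERED` are the ones the script above already uses. `ALERT_LOG` and `format_alert` are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Sketch of a more defensive alert logger. ALERT_LOG and format_alert are
# hypothetical names; the other variables are those used by the original
# script and are assumed to be set in the environment by the alert module.

ALERT_LOG=${ALERT_LOG:-/tmp/alert.txt}

# Build one log line on a single line of output; any variable that is
# unset or empty is logged as "unknown" instead of an empty field.
format_alert() {
    printf '%s %s.%s = %s (ack : %s, recovered : %s)\n' \
        "$(date '+%d/%m/%Y%t%H:%M:%S')" \
        "${BBHOSTNAME:-unknown}" \
        "${BBSVCNAME:-unknown}" \
        "${BBCOLORLEVEL:-unknown}" \
        "${ACKCODE:-unknown}" \
        "${RECOVERED:-unknown}"
}

format_alert >> "$ALERT_LOG"
```

Using `printf` with one format string keeps each entry on a single line, which makes it easier to compare consecutive timestamps later.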


I did some more testing, there seem to be 2 small problems :


1) Warning when the format of a script is missing

With this rule :

$ cat $BBHOME/etc/hobbit-alerts.cfg
HOST=fmangeant SERVICE=* EXSERVICE=procs,disk,mem,svcs REPEAT=24h TIME=W:0900:1800 SCRIPT /tmp/alert.sh FORMAT=TEXT
HOST=fmangeant SERVICE=disk DURATION>2m SCRIPT /tmp/alert.sh

I get a warning :

$ $BBHOME/bin/bbcmd hobbitd_alert --test fmangeant disk
2005-02-16 15:22:03 Using default environment file /BB/hobbit/server/etc/hobbitserver.cfg
2005-02-16 15:22:03 Ignoring SCRIPT with no recipient at line 2
Matching host:service:page 'fmangeant:disk:' against rule line 1:Failed (service excluded)
Matching host:service:page 'fmangeant:disk:' against rule line 2:Failed (min. duration)

If I add the format of the script, like this, the warning goes away :

$ cat $BBHOME/etc/hobbit-alerts.cfg
HOST=fmangeant SERVICE=* EXSERVICE=procs,disk,mem,svcs REPEAT=24h TIME=W:0900:1800 SCRIPT /tmp/alert.sh FORMAT=TEXT
HOST=fmangeant SERVICE=disk DURATION>2m SCRIPT /tmp/alert.sh FORMAT=text

$ $BBHOME/bin/bbcmd hobbitd_alert --test fmangeant disk
2005-02-16 15:22:54 Using default environment file /BB/hobbit/server/etc/hobbitserver.cfg
Matching host:service:page 'fmangeant:disk:' against rule line 1:Failed (service excluded)
Matching host:service:page 'fmangeant:disk:' against rule line 2:Failed (min. duration)


2) Repeat interval not correctly taken into account

I tried to repeat an alert every 5 minutes :

$ cat $BBHOME/etc/hobbit-alerts.cfg
HOST=fmangeant SERVICE=* EXSERVICE=procs,disk,mem,svcs REPEAT=24h TIME=W:0900:1800 SCRIPT /tmp/alert.sh FORMAT=TEXT
HOST=fmangeant SERVICE=disk DURATION>2m SCRIPT /tmp/alert.sh FORMAT=TEXT
HOST=fmangeant SERVICE=procs REPEAT=5m SCRIPT /tmp/alert.sh FORMAT=TEXT

$ $BBHOME/bin/bbcmd hobbitd_alert --test fmangeant procs
2005-02-16 15:23:59 Using default environment file /BB/hobbit/server/etc/hobbitserver.cfg
Matching host:service:page 'fmangeant:procs:' against rule line 1:Failed (service excluded)
Matching host:service:page 'fmangeant:procs:' against rule line 2:Failed (min. duration)
Matching host:service:page 'fmangeant:procs:' against rule line 3:Matched
    *** Match with 'HOST=fmangeant SERVICE=procs REPEAT=5m SCRIPT /tmp/alert.sh FORMAT=TEXT' ***
Script alert with command '/tmp/alert.sh' and recipient FORMAT=TEXT

But I got paged every 30 minutes :

$ cat /tmp/alert.txt
16/02/2005      14:43:27 fmangeant.procs = red (ack : 145155, recovered : 0)
16/02/2005      15:13:30 fmangeant.procs = red (ack : 145155, recovered : 0)
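
The interval between those two entries can be computed rather than read off by eye. This is a small sketch that assumes both timestamps fall on the same day and use the HH:MM:SS form shown above. `to_seconds` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: convert HH:MM:SS to seconds since midnight.
# Leading zeros are stripped so values like "09" are not read as octal.
to_seconds() {
    h=${1%%:*}
    rest=${1#*:}
    m=${rest%%:*}
    s=${rest#*:}
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Timestamps of the two 'fmangeant.procs' entries in /tmp/alert.txt.
first=$(to_seconds 14:43:27)
second=$(to_seconds 15:13:30)

gap=$(( second - first ))
echo "gap: $(( gap / 60 ))m $(( gap % 60 ))s"   # prints "gap: 30m 3s"
```

The two alerts are 1803 seconds apart, about 30 minutes, not the 5 minutes requested with REPEAT=5m.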

Is it possible to use any repeat value ?

Thanks in advance.

Regards,

-- 

Frédéric Mangeant