Xymon Mailing List Archive

hobbitd_alert: Servers on multiple pages & PAGE= rules

list S Aiello
Thu, 31 May 2007 12:21:20 -0400
Message-Id: <user-8f63a68ea05c@xymon.invalid>

Digging back into this issue. That workaround would only be practical if I had 
a small # of devices and/or a small # of page based rules. If I have a lot of 
devices, the script would get called for every host. And with a lot of these 
page based alert rules, that would be a large number of scripts being spawned 
for each alert.

But out of desperation, and having a bit of time, I started to dig into this 
further. Even the hobbitdboard command does not show the devices. Say I have 
the following:

Main Page:
 |-Page1
 |  +Host1a
 |  +Host1b
 |  +Host1c
 |  +Host2c
 |
 |-Page2
    +Host2a
    +Host2b
    +Host1a
    +Host2c

Now if I use the command, 'bin/bb 127.0.0.1 "hobbitdboard page=Page2 test=info 
fields=hostname"', I will get the results of, "Host2a, Host2b, Host2c". 
Host1a will not be shown.

So it seems that alerts, hobbitdboard, and the part of the info page that 
shows which alert rules the device matches are all affected. The Page/subpage 
section of the info page does not appear to be affected.

I am using my bb-hosts file from BigBrother. It uses the page & subpage tags. 
I do not use any subparent tags. I am using Hobbit 4.2.0 with the 02/09/2007 
all-in-one patch.

Is there any other information I can collect to help resolve this problem?
 ~Steve

On Tuesday 17 April 2007 16:12, John Glowacki wrote:
Here is an example workaround. Call SCRIPT instead of MAIL and do
page filtering in a custom script. That might give you some ideas for
other workarounds until a reason for the problem is found.

hobbit-alerts.cfg:
HOST=* RECOVERED NOTICE
     SCRIPT /opt/hobbit/server/etc/alert.sh noc FORMAT=SCRIPT

alert.sh:
# Check if hostname is on page=test
bb 127.0.0.1 "hobbitdboard test=$BBSVCNAME page=^test fields=hostname" |
grep "^$BBHOSTNAME$"
if [ "$?" = "0" ]
then
    echo "Skip alert: `date` P $BBHOSTNAME.$BBSVCNAME" >> /var/log/hobbit/skip_alert.log
    exit 0
fi
# send alert (mail,sms,etc)

John

user-ce96540ed38f@xymon.invalid wrote:
Since I have not seen a response I will assume there presently isn't a
known workaround. The only workaround I can come up with is the use of
MACROS in the hobbit-alerts.cfg. Basically using MACROS to define a group
of hosts, and then creating an alert rule with HOST=$MacroName.
$GROUPA=%HOSTA|HOSTB|HOSTC|HOSTF
HOST=$GROUPA

This would be similar to HostGroup (hg-) in BigBrother. I haven't tested
this yet. I was hoping that PAGE= would work for any page a Device is
listed on and not just the top level one. That would make management of
alert rules much easier and was one of the big features in Hobbit I was
looking forward to.
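
As a rough illustration of the macro idea above, the sketch below builds one
macro line per page from a bb-hosts style file, so the host list does not
have to be maintained by hand. The function name page_macro and the parsing
rules (page/subpage tags matched case-insensitively, host lines starting with
an IP address) are assumptions for this example, not part of Hobbit, and
subparent tags are not handled.

```shell
#!/bin/sh
# page_macro FILE PAGEPATH MACRONAME   (hypothetical helper, not a Hobbit tool)
# Prints "$MACRONAME=%host1|host2|..." for every host listed under PAGEPATH.
# Host names are emitted as-is, in the same shape as the macro shown above.
page_macro() {
    awk -v want="$2" -v name="$3" '
        # Assumption: page/subpage tags are recognised regardless of case.
        { tag = tolower($1) }
        tag == "page"    { page = $2; cur = page; next }
        tag == "subpage" { cur = page "/" $2; next }
        # Host lines start with an IP address; collect each host once.
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && cur == want {
            if (!seen[$2]++) list = list (list == "" ? "" : "|") $2
        }
        END { if (list != "") printf "$%s=%%%s\n", name, list }
    ' "$1"
}
```

For example, "page_macro bb-hosts servers/Other OTHER" would print a line that
could be pasted into hobbit-alerts.cfg and used with HOST=$OTHER.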

I still do not know if PAGE= only working for the top level Page listing
is a Bug or not. I did see a past mailing list article
(http://www.hobbitmon.com/hobbiton/2005/05/msg0t0211.html) and a patch to
fix the hobbitd_alert --test option. I did not see that patch mentioned in
the all-in-one patch, nor was I able to apply the patch; the attempt
produced an error.

I believe this issue is also bigger than just the failure occurring during
the --test option. I do not see alert rules being applied in the info
page, nor do I receive alerts. I tried looking into the code, but was
unable to come up with an answer/fix myself.

Henrik, or anyone else, if you have time to dig into this, I would
appreciate it greatly.

 ~Steve

On Tuesday 10 April 2007 16:11, user-ce96540ed38f@xymon.invalid wrote:
Searching the mailing list archives, I see a few others have experienced
the same problem. So is there a recommended workaround, or is this on
the todo list to be fixed?

Any information would be most welcomed, Thanks.
 ~Steve

On Friday 30 March 2007 16:30, user-ce96540ed38f@xymon.invalid wrote:
Let's say I have the following bb-hosts file:
page servers
subpage Web Web Servers
1.2.3.4		Web01.domain.com			#
1.2.3.5		Web02.domain.com			#
Subpage Other Other Web Servers
0.0.0.0		Web02.domain.com			#

And now I have the following hobbit-alerts.cfg:
PAGE=servers/Other SERVICE=conn
	MAIL user-f21f09270637@xymon.invalid COLOR=red,yellow

Now I run the command, "bin/bbcmd hobbitd_alert --test Web02.domain.com
conn", I see:
2007-03-30 15:09:38 Using default environment file ..../server/etc/hobbitserver.cfg
00000897 2007-03-30 15:09:38 send_alert Web02.domain.com:conn state Paging
00000897 2007-03-30 15:09:38 Matching host:service:page 'Web02.domain.com:conn:server/Web' against rule line 119
00000897 2007-03-30 15:09:38 Failed 'PAGE=servers/Other SERVICE=conn' (pagename not in include list)

So it seems that PAGE= alert rules only honor the first page a
device is listed on. Is this by design?
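
To make the difference concrete, here is a small model of the two behaviours,
written against a bb-hosts style file. It is only a sketch of the observation
above, not Hobbit's actual matching code; host_pages and page_rule_matches are
hypothetical names, and the parsing assumes plain page/subpage tags (matched
case-insensitively) and host lines that start with an IP address.

```shell
#!/bin/sh
# host_pages FILE HOST: print every page path on which HOST is listed,
# in file order (hypothetical helper).
host_pages() {
    awk -v host="$2" '
        { tag = tolower($1) }
        tag == "page"    { page = $2; cur = page; next }
        tag == "subpage" { cur = page "/" $2; next }
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $2 == host { print cur }
    ' "$1"
}

# page_rule_matches FILE HOST PAGE MODE
# MODE=first models the observed behaviour (only the first listing counts);
# MODE=any models the hoped-for behaviour (any listing counts).
page_rule_matches() {
    case "$4" in
        first) host_pages "$1" "$2" | head -n 1 | grep -qxF "$3" ;;
        any)   host_pages "$1" "$2" | grep -qxF "$3" ;;
        *)     return 2 ;;
    esac
}
```

With the bb-hosts file above, a check for Web02.domain.com against
servers/Other fails in "first" mode and succeeds in "any" mode, which is the
gap between what the --test output shows and what the rule was meant to do.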