Xymon Mailing List Archive

hobbitd_alert: Servers on multiple pages & PAGE= rules

John Glowacki
Tue, 17 Apr 2007 16:12:34 -0400
Message-Id: <user-0c0f1828543d@xymon.invalid>

Here is an example of a workaround: call SCRIPT instead of MAIL and do
the page filtering in a custom script. That might give you some ideas for
other workarounds until the cause of the problem is found.

hobbit-alerts.cfg:
HOST=* RECOVERED NOTICE
     SCRIPT /opt/hobbit/server/etc/alert.sh noc FORMAT=SCRIPT

alert.sh:
# Check if hostname is on page=test
bb 127.0.0.1 "hobbitdboard test=$BBSVCNAME page=^test fields=hostname" | grep "^$BBHOSTNAME$"
if [ "$?" = "0" ]
then
    echo "Skip alert: `date` P $BBHOSTNAME.$BBSVCNAME" >> /var/log/hobbit/skip_alert.log
    exit 0
fi
# send alert (mail,sms,etc)
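
The check above can also be written as a small reusable function. This is only a sketch: the function names host_in_list and host_on_page are made up for illustration, and it assumes the bb client is on the PATH and that hobbitdboard accepts the same page= and fields= filters used in the script above. Unlike the plain grep, it matches the hostname as a fixed string, so the dots in a hostname are not treated as regex wildcards.

```shell
#!/bin/sh
# host_in_list HOSTNAME
# Succeeds if HOSTNAME appears as a complete line on stdin.
# -F: fixed string (dots match literally), -x: whole line, -q: no output.
host_in_list() {
    grep -Fxq -- "$1"
}

# host_on_page PAGEREGEX HOSTNAME SERVICE  (hypothetical helper)
# Runs the same hobbitdboard query as the script above and checks
# whether HOSTNAME is among the hosts returned for the matching pages.
host_on_page() {
    bb 127.0.0.1 "hobbitdboard test=$3 page=$1 fields=hostname" | host_in_list "$2"
}
```

In alert.sh this would be used as: if host_on_page '^test' "$BBHOSTNAME" "$BBSVCNAME"; then exit 0; fi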

John

user-ce96540ed38f@xymon.invalid wrote:
Since I have not seen a response I will assume there presently isn't a known workaround. The only workaround I can come up with is the use of MACROS in the hobbit-alerts.cfg. Basically using MACROS to define a group of hosts, and then creating an alert rule with HOST=$MacroName.
$GROUPA=%HOSTA|HOSTB|HOSTC|HOSTF
HOST=$GROUPA

This would be similar to HostGroup (hg-) in BigBrother. I haven't tested this yet. I was hoping that PAGE= would work for any page a Device is listed on and not just the top level one. That would make management of alert rules much easier and was one of the big features in Hobbit I was looking forward to.
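
If the macro approach is used, the host list could be generated from bb-hosts rather than maintained by hand. The following is an untested-in-production sketch, not part of Hobbit: hosts_on_page and page_macro are hypothetical helper names, the parser only understands "page" and "subpage" lines (it ignores include, group and subparent directives), and it treats those keywords case-insensitively as an assumption. The macro line is emitted in the same form as the example above.

```shell
#!/bin/sh
# hosts_on_page PAGEPATH [FILE]
# Print the hostnames listed under PAGEPATH (e.g. "servers/Other") in a
# bb-hosts style file, one per line. Reads stdin when FILE is omitted
# ("-" is the awk operand for standard input).
hosts_on_page() {
    awk -v want="$1" '
        { key = tolower($1) }
        key == "page"    { page = $2; path = page; next }
        key == "subpage" { path = page "/" $2; next }
        # Host lines start with a dotted-quad IP address, then the hostname.
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && path == want { print $2 }
    ' "${2:--}"
}

# page_macro NAME PAGEPATH [FILE]
# Emit a macro definition of the form $NAME=%host1|host2|...
page_macro() {
    list=$(hosts_on_page "$2" "$3" | paste -s -d '|' -)
    printf '$%s=%%%s\n' "$1" "$list"
}
```

For example, page_macro GROUPA servers/Other bb-hosts prints a $GROUPA=%... line that can be pasted into hobbit-alerts.cfg and used with HOST=$GROUPA. Note that the hostnames end up inside a regular expression, so their dots match any character.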

I still do not know whether PAGE= only working for the top-level page listing is a bug or not. I did see a past mailing list article (http://www.hobbitmon.com/hobbiton/2005/05/msg0t0211.html) and a patch to fix the hobbitd_alert --test option. I did not see that patch mentioned in the all-in-one patch, nor was I able to apply it; the patch failed with an error.

I believe this issue is also bigger than just the failure occurring with the --test option. I do not see the alert rules being applied on the info page, nor do I receive alerts. I tried looking into the code, but was unable to come up with an answer or fix myself.

Henrik, or anyone else, if you have time to dig into this, I would appreciate it greatly.

 ~Steve


On Tuesday 10 April 2007 16:11, user-ce96540ed38f@xymon.invalid wrote:
Searching the mailing list archives, I see a few others have experienced
the same problem. Is there a recommended workaround, or is this on the
to-do list to be fixed?

Any information would be most welcomed, Thanks.
 ~Steve

On Friday 30 March 2007 16:30, user-ce96540ed38f@xymon.invalid wrote:
Let's say I have the following bb-hosts file:
page servers
subpage Web Web Servers
1.2.3.4		Web01.domain.com			#
1.2.3.5		Web02.domain.com			#
subpage Other Other Web Servers
0.0.0.0		Web02.domain.com			#

And now I have the following hobbit-alerts.cfg:
PAGE=servers/Other SERVICE=conn
	MAIL user-f21f09270637@xymon.invalid COLOR=red,yellow

When I run the command "bin/bbcmd hobbitd_alert --test Web02.domain.com conn",
I see:
2007-03-30 15:09:38 Using default environment file ..../server/etc/hobbitserver.cfg
00000897 2007-03-30 15:09:38 send_alert Web02.domain.com:conn state Paging
00000897 2007-03-30 15:09:38 Matching host:service:page 'Web02.domain.com:conn:server/Web' against rule line 119
00000897 2007-03-30 15:09:38 Failed 'PAGE=servers/Other SERVICE=conn' (pagename not in include list)

So it seems that PAGE= alert rules only honor the first page a device is
listed on. Is this by design?