Xymon Mailing List Archive

Hobbit 4.0 RC3

12 messages in this thread

list Henrik Størner · Tue, 22 Feb 2005 23:49:00 +0100 ·
Another release candidate - 4.0 RC3 - is now available on Sourceforge.
There are a couple of outstanding bug reports related to alerts that
I would like to get a grip on before calling this an official release.

The list of changes - see below - is again rather long. Most notably,
hobbitd crashing because of a mis-setting of the MACHINE variable 
has been fixed, as well as the bbtest-net crashes that happened 
with the apache-test.

Also of note: Installing Hobbit is now always done by running "make
install". The old "setup" target no longer exists; "make install" will
install everything, and even update your configuration files if new
settings have been added. Expect quite a few updates when you upgrade
from a previous version to 4.0-RC3, as I added all of the settings
needed by the BB client package (so that extension scripts have all 
of the environment variables they expect).


Regards,
Henrik


Changes from RC-2 -> RC-3

Configuration file changes:
* The bb-services file format was changed slightly.
  Instead of "service foo" to define a service, it is
  now "[foo]". Existing files will be converted 
  automatically by "make install".
  
* The name of the "conn" column (for ping-tests) is used
  throughout Hobbit, and had to be set in multiple locations.
  Changed all of them to use the setting from the PINGCOLUMN
  environment variable, and added this to hobbitserver.cfg.

* The --purple-conn option was dropped from hobbitd.
  It should be removed from hobbitlaunch.cfg.

* The --ping=COLUMNNAME option for bbtest-net should not
  be used any more. "--ping" enables the ping tests, the
  name of the column is taken from the PINGCOLUMN variable.

* The GRAPHS setting in hobbitserver.cfg no longer needs to
  have the simple TCP tests defined. These are automatically
  picked up from the bb-services file.
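
To illustrate the bb-services format change in the first item above, here is a minimal sketch. It is not the conversion that "make install" performs; convert_service_lines is a hypothetical helper, and it assumes every old-style definition is a line of the form "service NAME" while all other lines are left as they are.

```python
import re

# Hypothetical helper, not part of Hobbit: rewrites old-style
# "service foo" lines into the new "[foo]" section headers.
# Assumption: a definition is the word "service", whitespace, one name.
_SERVICE_LINE = re.compile(r"^service\s+(\S+)\s*$")


def convert_service_lines(text):
    """Return text with each 'service NAME' line replaced by '[NAME]'."""
    converted = []
    for line in text.splitlines():
        match = _SERVICE_LINE.match(line)
        # Lines that are not service definitions pass through unchanged.
        converted.append("[%s]" % match.group(1) if match else line)
    return "\n".join(converted)


if __name__ == "__main__":
    old = "service foo\n   (settings for foo stay as they are)"
    print(convert_service_lines(old))
```

The real bb-services file may contain constructs this sketch does not anticipate, so treat it as a picture of the rename rather than a migration tool.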

Bugfixes:
* hobbitd no longer crashes, if the MACHINE name from 
  hobbitserver.cfg is not listed in bb-hosts. Thanks to
  Anonymous for helping me track down this bug.

* If hobbitd crashed, then hobbitlaunch would attempt
  to restart it immediately. Added a 5 second delay,
  so that there's time for the OS to clean up any open
  sockets, files etc that might prevent a restart from
  working.

* The "disk" RRD handler could be confused by reports 
  from a Unix server, and mistake it for a report from a
  Windows server. This caused the report to try and store
  data in an RRD file with an invalid filename, so no
  graph-data was being stored.

* The "cpu" and "disk" RRD handlers were enhanced to support 
  reports from the "filerstats2bb" script for monitoring NetApp
  systems. The disk-handler also supports the "inode" and "qtree"
  reports from the same script.

* bb-services was overwritten by a "make install". This
  wiped out custom network test definitions.

* bbnet would crash if you happened to define a "http"
  or "https" test instead of using a full URL.

* bbnet was mis-calculating the size of the URL used for the
  apache-test. This could cause it to overflow a buffer and
  crash.

* hobbitd would ignore the BBPORT setting and always default
  to using port 1984.

* Portability problems on HP-UX 11 should be resolved. From
  reports it appears that building RRDtool on HP-UX 11 is 
  somewhat of a challenge; however, the core library is
  all that Hobbit needs, so build-problems with the Perl
  modules can be ignored as far as Hobbit is concerned.

* hobbitd_alert could not handle multiple recipients for scripts,
  and mistakenly assumed all recipients with a "@" were for
  e-mail recipients.

* Alert messages no longer include the "<!-- flags:... ->"
  summary; this is for Hobbit internal use only.

* "suse" and "mandrake" are recognized as aliases for "linux"
  in the RRD handler.


Improvements:
* The info-pages now list the Hobbit alert configuration.

* hobbitd_alert now has a "--trace=FILENAME" option. This 
  causes it to log a complete trace of all messages received
  from hobbitd, and how they are handled and what alerts get
  sent out as a result. This should help in tracking down
  alert problems.

* New FORMAT=PLAIN setting for alert recipients. This is the
  same as FORMAT=TEXT, except that the URL link to the status-
  page is left out of the message.

* The "setup" target for make has been removed. "make install"
  will now do all of the work, and will also merge in any
  added settings to the hobbitserver.cfg, hobbitgraph.cfg,
  hobbitlaunch.cfg, columndoc.csv and bb-services files.
  The standard files in ~/server/web/ and ~/server/www/ are
  also updated, if a previous version of the standard file
  is found.

* The graph included on a status view page can now be 
  zoomed directly, without having to go over the "view all
  period graphs" page.

* Color-names in hobbit-alerts.cfg are now case-insensitive.

* If the "acknowledge alert" webpage is password-protected,
  the login-username is now included in the acknowledge 
  message. This will also appear in the BB2 acknowledgement
  log display, and on the status page.

* More tips added to the "Tips & Tricks" document: How to get
  temperature graphs with Fahrenheit, how to configure Apache
  to allow viewing of the CGI man-pages.

* A native MD5 message-digest routine was added, so content-
  checks using digests will work even when Hobbit is built
  without OpenSSL support. The routine was taken from
  http://sourceforge.net/projects/libmd5-rfc/

* bb-findhost CGI will let you search for IP-addresses.

* The "--recentgifs" option to bbgen now has a parameter,
  so you can specify what the threshold is for a status to have
  changed "recently". The default is 24 hours.
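
As an aside on the MD5 item above: a digest-based content check boils down to hashing the fetched content and comparing the result with a stored value. The sketch below uses Python's standard hashlib rather than the libmd5-rfc routine that Hobbit bundles, and the function names md5_hex and content_matches are made up for this illustration; how Hobbit itself stores and compares digests is not shown in this thread.

```python
import hashlib


def md5_hex(content):
    """Return the hexadecimal MD5 digest of a bytes object."""
    return hashlib.md5(content).hexdigest()


def content_matches(content, expected_hex):
    """Compare the digest of content with an expected hex string.

    The comparison ignores the case of the expected digest.
    """
    return md5_hex(content) == expected_hex.lower()


if __name__ == "__main__":
    page = b"hello world"
    print(md5_hex(page))
    print(content_matches(page, "5eb63bbbe01eeed093cb22bb8f5acdc3"))
```

Because the digest covers every byte, even a whitespace change in the page produces a different value, which is what makes this style of check strict.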
list Henrik Størner · Tue, 22 Feb 2005 23:52:00 +0100 ·
quoted from Henrik Størner
On Tue, Feb 22, 2005 at 11:49:00PM +0100, Henrik Stoerner wrote:
Another release candidate - 4.0 RC3 - is now available on Sourceforge.
There are a couple of outstanding bug reports related to alerts that
I would like to get a grip on before calling this an official release.
One thing I forgot to mention: If you do run into problems with
alerts not happening as you expect them to, please add the
option "--trace=FILENAME" to the hobbitd_alert command in
hobbitlaunch.cfg. This causes all alert-activity to be logged to
the file you specify, and will make it much easier to figure out why
the code acted the way it did.


Henrik
list Bruce Lysik · Tue, 22 Feb 2005 15:14:42 -0800 ·
* The info-pages now list the Hobbit alert configuration.
This is an awesome feature!

One comment: the COLORS column doesn't seem to notice when --alertcolors is set to something different in hobbitlaunch.cfg.  (I've set --alertcolors to purple,red, but the COLORS column says I'm alerting on purple,yellow,red.)

Hobbit is chugging along great.

--
Bruce Z. Lysik  <user-4e63a10f8934@xymon.invalid>
Operations Engineer
list Asif Iqbal · Wed, 23 Feb 2005 02:10:58 -0500 ·
quoted from Henrik Størner
On Tue, Feb 22, 2005 at 11:49:00PM, Henrik Stoerner wrote:
Another release candidate - 4.0 RC3 - is now available on Sourceforge.
bb-infocolumn.c: In function `generate_hobbit_alertinfo':
bb-infocolumn.c:110: error: `PATH_MAX' undeclared (first use in this
function)
bb-infocolumn.c:110: error: (Each undeclared identifier is reported only
once
bb-infocolumn.c:110: error: for each function it appears in.)
gmake[1]: *** [bb-infocolumn.o] Error 1
gmake[1]: Leaving directory `/usr/share/src/hobbit-4.0-RC3/bbdisplay'
gmake: *** [bbdisplay-build] Error 2
quoted from Henrik Størner
There are a couple of outstanding bug reports related to alerts that
I would like to get a grip on before calling this an official release.
[...]
-- 

Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
"It is not the strongest of the species that survives, not the most intelligent, but
 the one most responsive to change."    - Charles Darwin
list Henrik Størner · Wed, 23 Feb 2005 08:17:04 +0100 ·
quoted from Asif Iqbal
On Wed, Feb 23, 2005 at 02:10:58AM -0500, Asif Iqbal wrote:
On Tue, Feb 22, 2005 at 11:49:00PM, Henrik Stoerner wrote:
Another release candidate - 4.0 RC3 - is now available on Sourceforge.
bb-infocolumn.c: In function `generate_hobbit_alertinfo':
bb-infocolumn.c:110: error: `PATH_MAX' undeclared (first use in this
function)
Oh, that's a silly one. Just add

 #include <limits.h>

next to all the other "#include..." lines near the top of that file.


Henrik
list Wim de Houwer · Wed, 23 Feb 2005 10:39:37 +0100 ·
Hey All,

I don't know if I've hit a bug, but my situation is like this:

Extract from hobbit-alerts.cfg:

HOST=%B2B*
    SCRIPT /usr/local/pager/sms 1234567890 color=red format=sms
time=*:0700:2359 repeat=120m


It seems hobbit interprets as follows:

It takes every line and uses the first part of the line correctly but
does something wrong on the second part of the line, or it might just be
the user that's doing something wrong ...


Service  Recipient   1st Delay  Stop after  Repeat  Time of Day  Colors
bgp      1234567890  -          -           30m     -            red
         format=sms  -          -           30m     -            purple,yellow,red
conn     1234567890  -          -           30m     -            red
         format=sms  -          -           30m     -            purple,yellow,red
mem      1234567890  -          -           30m     -            red
         format=sms  -          -           30m     -            purple,yellow,red

Anyone any suggestions ?

Cheers,

Wim
list Henrik Størner · Wed, 23 Feb 2005 11:13:41 +0100 ·
quoted from Wim de Houwer
On Wed, Feb 23, 2005 at 10:39:37AM +0100, Wim De Houwer wrote:
Hey All,

I don't know if i've hit a bug but my situation is like this:

Extraction of hobbit-alerts.cfg

HOST=%B2B*
This matches hosts called "B2", "B2B", "B2BB", "B2BBB" etc.

You probably intended it as "HOST=%B2B.*"
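
The difference between the two patterns can be tried out with a short sketch. It uses Python's re module, which treats these two simple patterns the same way PCRE does, and it assumes the pattern is compared against the whole host name; whether Hobbit anchors the match like that is not stated in this thread. The name "B2B-web01" and the helper matching_hosts are invented for the example.

```python
import re

# Sample host names; "B2B-web01" is a made-up example.
hosts = ["B2", "B2B", "B2BB", "B2B-web01"]


def matching_hosts(pattern, names):
    """Return the names that the pattern matches in full."""
    return [name for name in names if re.fullmatch(pattern, name)]


if __name__ == "__main__":
    # "B*" means "zero or more B", so only B2, B2B, B2BB qualify.
    print(matching_hosts("B2B*", hosts))   # ['B2', 'B2B', 'B2BB']
    # ".*" means "anything", so every name starting with B2B qualifies.
    print(matching_hosts("B2B.*", hosts))  # ['B2B', 'B2BB', 'B2B-web01']
```

In a regular expression the "*" applies to the character before it, unlike a shell wildcard, which is why the trailing "B*" does not mean "B followed by anything".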
quoted from Wim de Houwer
    SCRIPT /usr/local/pager/sms 1234567890 color=red format=sms
time=*:0700:2359 repeat=120m
It seems hobbit interprets as follows:

It takes every line and uses the first part of the line correctly but
does something wrong on the second part of the line
Yes this looks odd, and if I use your configuration it also looks odd
on my system. I'll get that straightened out.


Henrik
list Frédéric Mangeant · Wed, 23 Feb 2005 16:18:39 +0100 ·
Hi all

I'm still having some issues with Hobbit 4.0 RC3 (installed from scratch, on
a Gentoo Linux x86 up to date).
The main problem is that I can't disable a host using maint.pl: it just does
nothing, and I get this in my Apache error_log :

maint.pl: Use of uninitialized value in substitution (s///) at
/BB/hobbit/cgi-secure/maint.pl line 550., referer:
http://xx.xx.xx.xx/hobbit/

FYI, Perl was upgraded from perl-5.8.5-r4 to perl-5.8.5-r2, regarding 2
security alerts (CAN-2005-015{5,6}). I'll try to downgrade.
Changes from RC-2 -> RC-3
[...]
Improvements:
* The info-pages now list the Hobbit alert configuration.
With this paging rule :

HOST=foo SERVICE=* REPEAT=24h TIME=W:0900:1800 DURATION>5m SCRIPT
/tmp/alert.sh FORMAT=TEXT

I get this in the "Recipient" case : "FORMAT=TEXT"

Shouldn't it be the script name ?

Another small problem : I'm running bbgen with the option "
--infoupdate=300", and get this warning under the "bbgen" column :

Error output:
Unknown option : --infoupdate=300

Something seems to be missing in the sources :

$ find /tmp/hobbit-4.0-RC3 | xargs grep infoupdate
./docs/manpages/man1/bbgen.1.html:<DT>--infoupdate=N<DD>
./bbdisplay/bbgen.1:.IP "--infoupdate=N"
./bbdisplay/bbgen.c:                    printf("    --infoupdate=N
: time between updates of INFO column pages in seconds\n");
quoted from Henrik Størner
* New FORMAT=PLAIN setting for alert recipients. This is the
  same as FORMAT=TEXT, except that the URL link to the status-
  page is left out of the message.
I'm still getting a warning if "FORMAT=TEXT" is not specified in
hobbit-alerts.cfg :

Ignoring SCRIPT with no recipient at line 1
quoted from Henrik Størner
* The "--recentgifs" option to bbgen now has a parameter,
  so you can specify what the threshold is for a status to have
  changed "recently". The default is 24 hours.
Thanks a lot for this one :-)

Regards,

-- 

Frédéric Mangeant
list Frédéric Mangeant · Wed, 23 Feb 2005 17:46:47 +0100 ·
Another issue : with this paging rule

HOST=foo SERVICE=* EXSERVICE=procs REPEAT=24h TIME=W:0900:1800 DURATION>5m
SCRIPT /tmp/alert.sh FORMAT=TEXT

I got paged every 30 minutes with a red "disk" column, instead of 24 hours.
The trace file contains this :

00014678 2005-02-23 16:22:23 Matching host:service:page
'foo:disk:supervision/hobbit' against rule line 1

00014678 2005-02-23 16:22:23 *** Match with 'HOST=foo SERVICE=*
EXSERVICE=procs REPEAT=24h TIME=W:0900:1800 DURATION>5m SCRIPT /tmp/alert.sh
FORMAT=TEXT COLOR=red,purple' ***

00014678 2005-02-23 16:22:23 Matching host:service:page
'foo:disk:supervision/hobbit' against rule line 1

00014678 2005-02-23 16:22:23 *** Match with 'HOST=foo SERVICE=*
EXSERVICE=procs REPEAT=24h TIME=W:0900:1800 DURATION>5m SCRIPT /tmp/alert.sh
FORMAT=TEXT COLOR=red,purple' ***

00000836 2005-02-23 16:22:23 send_alert foo:disk state Paging

00000836 2005-02-23 16:22:23 Matching host:service:page
'foo:disk:supervision/hobbit' against rule line 1

00000836 2005-02-23 16:22:23 *** Match with 'HOST=foo SERVICE=*
EXSERVICE=procs REPEAT=24h TIME=W:0900:1800 DURATION>5m SCRIPT /tmp/alert.sh
FORMAT=TEXT COLOR=red,purple' ***

00000836 2005-02-23 16:22:23 Matching host:service:page
'foo:disk:supervision/hobbit' against rule line 1

00000836 2005-02-23 16:22:23 *** Match with 'HOST=foo SERVICE=*
EXSERVICE=procs REPEAT=24h TIME=W:0900:1800 DURATION>5m SCRIPT /tmp/alert.sh
FORMAT=TEXT COLOR=red,purple' ***

00000836 2005-02-23 16:22:23 Script alert with command '/tmp/alert.sh' and
recipient FORMAT=TEXT

00014678 2005-02-23 16:22:23 Matching host:service:page
'foo:disk:supervision/hobbit' against rule line 1

00014678 2005-02-23 16:22:23 *** Match with 'HOST=foo SERVICE=*
EXSERVICE=procs REPEAT=24h TIME=W:0900:1800 DURATION>5m SCRIPT /tmp/alert.sh
FORMAT=TEXT COLOR=red,purple' ***

00014678 2005-02-23 16:22:23 Matching host:service:page
'foo:disk:supervision/hobbit' against rule line 1

00014678 2005-02-23 16:22:23 *** Match with 'HOST=foo SERVICE=*
EXSERVICE=procs REPEAT=24h TIME=W:0900:1800 DURATION>5m SCRIPT /tmp/alert.sh
FORMAT=TEXT COLOR=red,purple' ***

00014678 2005-02-23 16:22:27 @@page foo:disk:supervision/hobbit=red
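
A trace like the one above gets long quickly, so a small filter can help when checking how often alerts really go out. The sketch below is not a Hobbit tool. It assumes each entry in the real trace file is a single line made of an 8-digit id, a date, a time and a message, as in the excerpt (the entries above were wrapped by the mail client), and count_send_alerts is a hypothetical name.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt: 8-digit id, date, time, message.
_TRACE_LINE = re.compile(
    r"^(\d{8}) (\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (.*)$"
)


def count_send_alerts(lines):
    """Count 'send_alert HOST:SERVICE ...' entries per host:service pair."""
    counts = Counter()
    for line in lines:
        match = _TRACE_LINE.match(line.strip())
        if not match:
            continue  # blank lines and wrapped continuations are skipped
        message = match.group(4)
        if message.startswith("send_alert "):
            # Second word of the message is the host:service pair.
            counts[message.split()[1]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "00000836 2005-02-23 16:22:23 send_alert foo:disk state Paging",
        "00014678 2005-02-23 16:22:27 @@page foo:disk:supervision/hobbit=red",
    ]
    print(dict(count_send_alerts(sample)))
```

Comparing the timestamps of successive send_alert entries for one host:service pair would show the effective repeat interval.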

-- 

Frédéric Mangeant
list Henrik Størner · Thu, 24 Feb 2005 00:02:49 +0100 ·
quoted from Frédéric Mangeant
On Wed, Feb 23, 2005 at 04:18:39PM +0100, Frédéric Mangeant wrote:
The main problem is that I can't disable a host using maint.pl: it just does
nothing, and I get this in my Apache error_log :

maint.pl: Use of uninitialized value in substitution (s///) at
/BB/hobbit/cgi-secure/maint.pl line 550., referer:
http://xx.xx.xx.xx/hobbit/
Others have reported the same, but it appears to be dependent on the
Perl version or configuration that is used. However, since this
particular bit of the maint.pl script is essentially "dead code" (it
was a preparation for some new feature in the Big Brother bbd daemon
which - as far as I know - was never implemented), I've ripped it out
now and that should hopefully take care of this problem.
quoted from Frédéric Mangeant

* The info-pages now list the Hobbit alert configuration.
With this paging rule :

HOST=foo SERVICE=* REPEAT=24h TIME=W:0900:1800 DURATION>5m SCRIPT
/tmp/alert.sh FORMAT=TEXT

I get this in the "Recipient" case : "FORMAT=TEXT"
It's a bug in RC-3. Should be fixed now.
quoted from Frédéric Mangeant

Another small problem : I'm running bbgen with the option "
--infoupdate=300", and get this warning under the "bbgen" column :

Error output:
Unknown option : --infoupdate=300
Both the --info and --infoupdate options are gone. I forgot to delete
them from the man-page, thanks for noticing.

If you want the info-column pages updated more frequently, change the
setting for the bb-infocolumn task in server/etc/hobbitlaunch.cfg


Henrik
list Henrik Størner · Thu, 24 Feb 2005 00:05:28 +0100 ·
quoted from Frédéric Mangeant
On Wed, Feb 23, 2005 at 05:46:47PM +0100, Frédéric Mangeant wrote:
Another issue : with this paging rule

HOST=foo SERVICE=* EXSERVICE=procs REPEAT=24h TIME=W:0900:1800 DURATION>5m
SCRIPT /tmp/alert.sh FORMAT=TEXT

I got paged every 30 minutes with a red "disk" column, instead of 24
hours.
You cannot set a REPEAT setting on a rule, it goes on the recipient.
And a SCRIPT recipient needs both the script name and a parameter.
So your alert-config should be

HOST=foo SERVICE=* EXSERVICE=procs TIME=W:0900:1800 DURATION>5m
   SCRIPT /tmp/alert.sh somerecipient FORMAT=TEXT REPEAT=24h
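
The rule that a SCRIPT recipient needs both a script name and a recipient parameter can be pictured with a small check. This is only a sketch, not how hobbitd_alert parses its configuration: split_script_recipient is a hypothetical helper, and it assumes that any token containing "=" is an option such as FORMAT=TEXT and never a script name or recipient.

```python
def split_script_recipient(line):
    """Return (script, recipient) from a SCRIPT line, or None if one is missing."""
    tokens = line.split()
    if "SCRIPT" not in tokens:
        return None
    rest = tokens[tokens.index("SCRIPT") + 1:]
    # Assumption: KEYWORD=value tokens are options, so finding one in
    # either of the first two positions means a required part is absent.
    if len(rest) < 2 or "=" in rest[0] or "=" in rest[1]:
        return None
    return rest[0], rest[1]


if __name__ == "__main__":
    good = "SCRIPT /tmp/alert.sh somerecipient FORMAT=TEXT REPEAT=24h"
    bad = "SCRIPT /tmp/alert.sh FORMAT=TEXT"
    print(split_script_recipient(good))
    print(split_script_recipient(bad))
```

With the original rule the token after the script name was FORMAT=TEXT, which matches the "recipient FORMAT=TEXT" entry seen in the trace earlier in this thread.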


Henrik
list Henrik Størner · Sun, 27 Feb 2005 17:10:49 +0100 ·
quoted from Bruce Lysik
On Tue, Feb 22, 2005 at 03:14:42PM -0800, Bruce Lysik wrote:
* The info-pages now list the Hobbit alert configuration.
This is an awesome feature!

One comment, the COLOR column doesn't seem to know if you set the
 --alertcolors different hobbitlaunch.cfg.  (I've set --alertcolors
 to purple,red, but the COLORS column says I'm alerting on
 purple,yellow,red.)
bb-infocolumn, which generates these webpages, needs to be told that
you've changed the default alert-colors. Starting with RC4, it
will accept the same "--alertcolors" option as hobbitd_alert,
and show you the correct setting.


Henrik