Xymon Mailing List Archive

big brother replacement

list Josh Luthman
Fri, 2 Nov 2007 10:44:17 -0400
Message-Id: <user-05418cae5a23@xymon.invalid>

So I take it that Joe has to Paypal Henrik $64 now?

Please let me, and everyone else of course, know how the failover script
works on Hobbit.  I'd be very interested in knowing the result of this!

Thanks to all three of you!

On 11/2/07, Henrik Stoerner <user-ce4a2c883f75@xymon.invalid> wrote:
Hi Joe,

On Thu, Nov 01, 2007 at 03:20:12PM -0700, Sloan wrote:
> So, the $64 question: Is there anything in hobbit, or on the horizon,
> which will allow hobbit to serve as a drop-in replacement for bb,
> including the failover capability?

The BB "failover" script does two things: it runs the network tests
on the failover server if the primary BBNET server cannot be pinged,
and it enables alerts to be sent from the failover server if the
failover server has no connection to the primary BBPAGER server.


The network-test failover is fairly simple to do. I've attached two
scripts here, both of which must run on the backup/standby/failover
server:

1) failover.sh - goes in ~hobbit/server/ext/
   Add a section to hobbitlaunch.cfg with

      [failovercheck]
        ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
        NEEDS hobbitd
        CMD $BBHOME/ext/failover.sh 10.0.0.1 hobbitnet.mydom.com

   "10.0.0.1" is the IP of your primary Hobbit server,
   "hobbitnet.mydom.com" is the hostname (in the bb-hosts file) of the
   primary network test machine.

   The script queries the primary Hobbit server for how long ago the
   network tests were last updated. If that was more than 7 minutes ago,
   it deems the primary network test node to be DOWN and flags this by
   creating the file $BBTMP/primarynetDOWN. If the update was less than
   7 minutes ago, it removes the file.

   The flag file is then used by the other script, which replaces the
   CMD in the "[bbnet]" section in hobbitlaunch.cfg.

2) failovernet.sh - goes in ~hobbit/server/ext/
   When this runs the normal network tests, it checks for the
   $BBTMP/primarynetDOWN file. If the file exists, it picks up the IP
   of the primary Hobbit server from the file and modifies the settings
   so that data is reported both to the normal (local) Hobbit server and
   to the primary server. If the file does not exist, it just runs the
   network tests the normal way.
   To use it, modify the [bbnet] section in hobbitlaunch.cfg and change
   the CMD setting to "$BBHOME/ext/failovernet.sh".
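
The attached scripts are not reproduced in this archive, so the following
is only a rough sketch of the decision logic the two descriptions imply,
not the actual failover.sh or failovernet.sh. The function names
update_flag and report_targets, the MAX_AGE variable, and passing the
timestamps in as arguments are all assumptions made for illustration; how
the "last updated" time is fetched from the primary server is left out.

```shell
#!/bin/sh
# Sketch only: NOT the attached scripts. Names and arguments are hypothetical.

MAX_AGE=420   # 7 minutes, in seconds

# update_flag LAST_UPDATE NOW FLAGFILE PRIMARY_IP
# Creates FLAGFILE (holding the primary server's IP) when the last network
# test update is more than MAX_AGE seconds old; removes it otherwise.
update_flag() {
    last_update=$1; now=$2; flagfile=$3; primary_ip=$4
    if [ $((now - last_update)) -gt "$MAX_AGE" ]; then
        echo "$primary_ip" > "$flagfile"
    else
        rm -f "$flagfile"
    fi
}

# report_targets FLAGFILE LOCAL_IP
# Prints the servers that network test results should be reported to:
# the local server alone, or the local server plus the primary whose IP
# is stored in FLAGFILE when that file exists.
report_targets() {
    flagfile=$1; local_ip=$2
    if [ -f "$flagfile" ]; then
        echo "$local_ip $(cat "$flagfile")"
    else
        echo "$local_ip"
    fi
}
```

In this sketch the first function stands in for failover.sh and would be
called with something like "$BBTMP/primarynetDOWN" as the flag file, while
the second stands in for the check failovernet.sh makes before running the
network tests. The real script would then have to apply the resulting list
to the reporting settings (presumably the BBDISP/BBDISPLAYS values in
hobbitserver.cfg, though that is a guess) before starting the tests.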


The alert failover is different, because Hobbit doesn't have a separate
BBPAGER server - alerts are sent from the same host that handles the
Hobbit data collection and webpages. A solution to this has been
implemented for the next release, where the alerting module can be
distributed onto multiple servers, but only one of them will send alerts
at any given time.


Regards,
Henrik

-- 
Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX

Those who don't understand UNIX are condemned to reinvent it, poorly.
--- Henry Spencer