Xymon Mailing List Archive

big brother replacement

45 messages in this thread

list Joe Sloan · Thu, 01 Nov 2007 15:20:12 -0700 ·
Hello list,

It's that time of year again - we're looking for alternatives to our
aging bb infrastructure - although it's been helped by the bbgen
extensions, it is showing its age, and is getting harder to support as
time goes by.

Of all the potential replacements we've looked at, I don't really like
any of them - the commercial bb stuff is uninspiring, and their linux
support is lacking. The other solutions tend to be heavyweight j2ee and
database apps, or oddities like nagios. What I'd really love to find is
something like an up-to-date version of big brother+bbgen, something
like hobbit.

Unfortunately, last I checked, hobbit still lacked a crucial capability
that we depend on, the built-in bb failover mechanism. We have 2 data
centers, several hundred miles apart, with bb servers in several lans at
both sites. Each bb server has a twin at the other location, and they
both monitor the servers in both data centers, but only one of the bb
servers does reporting, as determined by the failover state. The bb
failover has worked marvelously, and has kept bb firmly in place so far,
despite the other advantages of hobbit.

So, the $64 question: Is there anything in hobbit, or on the horizon,
which will allow hobbit to serve as a drop-in replacement for bb,
including the failover capability?

Thanks for your words of wisdom.

Joe
list Tod Hansmann · Thu, 1 Nov 2007 16:33:48 -0600 ·
Let me see if I understand.  You have several bb servers at one
datacenter, each with their twin at the other datacenter, and both sets
do the tests.  They report to one central display server, but only one
set reports at a time, depending on failover state, correct?  

Is this failover automatic?  If so, how is this failover determined?
What if this failover has a false positive?  If not, what is your
timeframe to swap over?

Tod Hansmann
Network Engineer
list Josh Luthman · Thu, 1 Nov 2007 18:35:21 -0400 ·
I'm not entirely sure what you mean when you reference the failover
capability.  Could you please explain how this works?  I'm interested in
knowing how the hostname reflects to what IP addresses, hardware running
what software specifically, etc.  Coming from BB1.9btf I don't know of many
expansions between 1.9 and 3.3.

We had some discussion about multiple servers and redundancy just a short
while ago:
http://www.hswn.dk/hobbiton/2007/10/msg00423.html

-- 

Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX

Those who don't understand UNIX are condemned to reinvent it, poorly.
--- Henry Spencer
list Joe Sloan · Thu, 01 Nov 2007 16:02:57 -0700 ·
Tod Hansmann wrote:
> Let me see if I understand.  You have several bb servers at one
> datacenter, each with their twin at the other datacenter, and both sets
> do the tests.  They report to one central display server, but only one
> set reports at a time, depending on failover state, correct?

You have the basic idea, but there is no single central server, just
pairs of bb servers, one to a data center, in each lan which is being
monitored. For each pair of bb servers, only the server at data center A
does reporting, unless the server in data center B cannot reach the
server in data center A, in which case the server in data center B will
take over the reporting duties until the bb server in data center A
becomes reachable again. While this could theoretically lead to a split
brain condition, the failover condition has only ever triggered when
there was a wan outage.

> Is this failover automatic?  If so, how is this failover determined?
> What if this failover has a false positive?  If not, what is your
> timeframe to swap over?

IIRC it takes one bb cycle to kick in.

We've not seen a false positive, as I mentioned above.

It's just the standard built-in bb failover -

head ~bb/ext/failover follows:

#!/bin/sh

# failover
#
# BIG BROTHER - FAILOVER SCRIPT
# Sean MacGuire
#
# (c) Copyright Quest Software, Inc.  1997-2003  All rights reserved.
#

#
# failover WATCHES BBNET and BBPAGER
#
# IF BBNET OR BBPAGER BECOMES UNAVAILABLE, THEN TAKE OVER UNTIL THEY RETURN
#
# To use, just add failover to the BBEXT variable in etc/bbdef.sh
#
# To configure BBPAGER failover:
# define both the primary and failover machines as BBPAGERS in etc/bb-hosts
# and set bbwarn: FAILOVER in etc/bbwarnsetup.cfg


Joe
list Josh Luthman · Thu, 1 Nov 2007 19:12:21 -0400 ·
I see what you're saying, but you still have to manually specify which
server you're connecting to.  If the bb1.domain.tld can not be reached the
techs have to manually enter bb2.domain.tld - correct?

I know of several BB ext scripts that work perfectly with Hobbit and even
more that just needed a small tweak.  Would you be able to post the entire
ext script?  Hopefully Henrik is willing to answer your $64 question =)

Josh
list Joe Sloan · Thu, 01 Nov 2007 16:14:49 -0700 ·
Josh Luthman wrote:
> I'm not entirely sure what you mean when you reference the failover
> capability.  Could you please explain how this works?

I'm basically just a user of the capability, how much detail do you
want? It's just the standard failover built into bb.

> I'm interested in knowing how the hostname reflects to what IP
> addresses, hardware running what software specifically, etc.

How the hostname reflects to what IP address? I'm not sure what you
mean. There are no tricks here, just the standard dns naming scheme.

I'm not sure what hardware has to do with it, but we're running SLES on
HP/Compaq DL servers.

I'm not sure what you mean by "what software" - you mean the OS, or the
applications being monitored, or the exact version of bb?

> Coming from BB1.9btf I don't know of many expansions between 1.9 and 3.3.

We're on 1.9 here, patched with bbgen to keep it going - AFAIK the bb
code has basically languished since it was bought back in 2001 or so, so
I'm curious about this version 3.3 that you speak of.

> We had some discussion about multiple servers and redundancy just a
> short while ago:
> http://www.hswn.dk/hobbiton/2007/10/msg00423.html

Yes, those discussions look mostly like the typical ha requirements,
e.g. managing bb failover via external proxies, redirectors etc, which
adds a whole new layer of cost and complexity. It would be a hard sell
to justify all the new ha infrastructure if we are replacing bb with
something newer and better, since bb currently handles that all by
itself, with no need of an external ha system.

Joe
list Tod Hansmann · Thu, 1 Nov 2007 17:17:22 -0600 ·
I'd be using Henrik's solution as follows, given your situation:

"I run two completely separate systems in parallel, and have the clients
report to both of them. The system at our disaster center has the paging
module disabled (just disable the [bbpage] section in hobbitlaunch.cfg),
to avoid double alerts - it is simple to activate it, if necessary.

"Config files are rsync'ed from the primary site to the disaster site
regularly."
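
A rough sketch of the config-sync half of that setup, in shell. The install path, the standby host name and the decision to exclude hobbitlaunch.cfg (so the standby keeps its own copy with [bbpage] disabled) are assumptions for illustration, not Henrik's actual configuration. By default it only prints the rsync command it would run.

```shell
#!/bin/sh
# Sketch: push Hobbit config files from the primary site to the disaster site.
# HOBBIT_ETC and STANDBY are hypothetical values; adjust to the local install.
HOBBIT_ETC="${HOBBIT_ETC:-/usr/lib/hobbit/server/etc}"
STANDBY="${STANDBY:-hobbit@dr-site.example.com}"
DRY_RUN="${DRY_RUN:-1}"

# Print the rsync command that would be run.
# -a preserves permissions and timestamps, -z compresses over the WAN link.
# hobbitlaunch.cfg is excluded (an assumption) so that the standby keeps
# its own launch file with the paging section disabled.
sync_cmd() {
    echo "rsync -az --exclude=hobbitlaunch.cfg $HOBBIT_ETC/ $STANDBY:$HOBBIT_ETC/"
}

# Either show the command (DRY_RUN=1, the default) or actually run it.
sync_config() {
    if [ "$DRY_RUN" = "1" ]; then
        sync_cmd
    else
        rsync -az --exclude=hobbitlaunch.cfg "$HOBBIT_ETC/" "$STANDBY:$HOBBIT_ETC/"
    fi
}

sync_config
```

Something like this could be run from cron on the primary at whatever interval counts as "regularly" for the site; activating paging on the standby stays a manual step, as described in the quote.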


Though to be honest, this failover script may be something that can be
converted for use in hobbit.  You might be better off going with one
of a dozen different options that are slightly different from how you
have it set up, but that's up to you.

Hobbit doesn't have this built-in.  That's for sure.  I would think it's
fairly easy to use it to get much the same effect, though.  I'll wait
for others' responses on your situation and throw my own thoughts back in
tomorrow morning.

Tod Hansmann
Network Engineer
 
 
list Tod Hansmann · Thu, 1 Nov 2007 17:18:36 -0600 ·
I think he's just looking for the alerts.  From what he's indicating, it
doesn't look like he's too concerned about the display (unless he has a
bunch of web pages up at once all the time).

 
Tod Hansmann

Network Engineer

 
<http://www.directpointe.com/>
list Joe Sloan · Thu, 01 Nov 2007 16:22:48 -0700 ·
Josh Luthman wrote:
> I see what you're saying, but you still have to manually specify
> which server you're connecting to.  If the bb1.domain.tld can not be
> reached the techs have to manually enter bb2.domain.tld - correct?

Well, no need to enter anything manually. We have on each big brother
server a page with links to all the other bb servers, and I'm sure the
support folks have links bookmarked, so if the DC1 data center is
down, then instead of clicking on e.g. dc1bbdata, they'd click on
dc2bbdata. Normally the 2 bb display servers in each pair provide an
identical view, so this only matters in the event of an outage.

> I know of several BB ext scripts that work perfectly with Hobbit and
> even more that just needed a small tweak.  Would you be able to post
> the entire ext script?  Hopefully Henrik is willing to answer your
> $64 question =)

Sure, I can post the entire script - it's in the attachment -

Joe
list Josh Luthman · Thu, 1 Nov 2007 19:33:29 -0400 ·
I see now - you've a redundant BBNET.  I haven't used BB in several weeks
and I never got really complex with it - a lot of ping tests was what I
needed out of it.  Can you explain what BBNET is?
list Joe Sloan · Thu, 01 Nov 2007 16:39:29 -0700 ·
Well, bb being somewhat modular, there are 3 components: BBNET, BBPAGER,
BBDISPLAY

BBNET is the component of bb that tests the network
services/connectivity on the remote hosts.

Joe
list Josh Luthman · Thu, 1 Nov 2007 19:57:05 -0400 ·
That would be relative to Hobbit's bbtest I believe - someone correct me if
I'm wrong, I'm just guessing here!

I don't see any reason to think that script wouldn't work with some variable
changes like BBNET=`$GREP BBNET $BBHOSTS | $GREP "^[0-9]" | $GREP -v "^\#"`
but I am not an expert by any means!
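
To make that filter chain concrete, here is a small self-contained sketch. The sample file and host names are made up for illustration; a real bb-hosts file may carry more tags per line. It reuses the same three grep stages and then reduces each match to its host name.

```shell
#!/bin/sh
# Sketch: list the hosts tagged BBNET in a bb-hosts style file.
# The file below is a throwaway sample, not a real bb-hosts.
GREP="${GREP:-grep}"
BBHOSTS=$(mktemp)

cat > "$BBHOSTS" <<'EOF'
# sample bb-hosts entries
10.0.0.1   dc1bb.example.com   # BBDISPLAY BBPAGER BBNET
10.0.1.1   dc2bb.example.com   # BBNET
10.0.0.20  web1.example.com    # http://web1.example.com/
#10.0.0.9  old.example.com     # BBNET
EOF

# Same stages as the suggestion above: lines mentioning BBNET, starting
# with an IP address, and not commented out. (The last stage is redundant
# once "^[0-9]" has matched, but is kept to mirror the original.)
bbnet_lines() {
    $GREP BBNET "$BBHOSTS" | $GREP "^[0-9]" | $GREP -v "^#"
}

# Reduce each matching line to its host name (second field).
bbnet_hosts() {
    bbnet_lines | awk '{ print $2 }'
}

BBNET_HOSTS=$(bbnet_hosts)
echo "$BBNET_HOSTS"
rm -f "$BBHOSTS"
```

With the sample data this prints the two live BBNET entries and skips both the untagged web server and the commented-out host.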

Getting back to the version 3.3 - after 1.9btf Quest started selling the
product.  I don't know the exact history behind it but 1.9btf is what you
get without paying for anything.  I have worked with and continue to monitor
a network with 3.1 or 3.2 (they decided to revert from 3.3 as it looks quite
a bit different on the BBDISPLAY) and I honestly don't see what they've
changed between 1.9 and 3.2.  Most of the features of BB I heard of or read
were not only already in Hobbit but were even better than what I had heard.
Not to mention the dozens of BB scripts that can be relatively painless to
migrate.

Josh

list Joe Sloan · Thu, 01 Nov 2007 17:05:03 -0700 ·
Josh Luthman wrote:
> That would be relative to Hobbit's bbtest I believe - someone correct
> me if I'm wrong, I'm just guessing here!
>
> I don't see any reason to think that script wouldn't work with some
> variable changes like BBNET=`$GREP BBNET $BBHOSTS | $GREP "^[0-9]" |
> $GREP -v "^\#"` but I am not an expert by any means!

Well, depending on what I hear on this list in the next day or so,
taking a crack at adapting the old bb failover script may be my best
option.

> Getting back to the version 3.3 - after 1.9btf Quest started selling
> the product.  I don't know the exact history behind it but 1.9btf is
> what you get without paying for anything.  I have worked with and
> continue to monitor a network with 3.1 or 3.2 (they decided to revert
> from 3.3 as it looks quite a bit different on the BBDISPLAY) and I
> honestly don't see what they've changed between 1.9 and 3.2.  Most of
> the features of BB I heard of or read were not only already in Hobbit
> but were even better than what I had heard.  Not to mention the dozens
> of BB scripts that can be relatively painless to migrate.

Ah, interesting - I always had the feeling that quest didn't do much of
anything with the code except put in some verbiage and legal warnings,
and tried to push their own proprietary and non linux-friendly stuff and
left the bb code base to slowly decay.

Joe
list Josh Luthman · Thu, 1 Nov 2007 21:22:50 -0400 ·
You think their lack of linux support is bad?  Get a quote.
list Mark Deiss · Fri, 2 Nov 2007 05:45:30 -0500 ·
For a vanilla BB environment, you can have multiple BBDISPLAY entities but
the recommendation is that there is only one BBNET entity. A BBNET server
that is generating the pings out to the clients will be sending the ping
results to all of the BBDISPLAY entities (as defined on the BBNET host). If
you have multiple BBNET entities that ping the same servers, you will be
sending duplicated results as far as the individual BBDISPLAY servers are
concerned (the connection messages will be renamed to the host being
pinged). To support multiple BBNETs in a non-race environment requires
additional coding to carefully direct the BBNET results to not trip over
each other. The default behavior is to pump them out to whatever BBDISPLAY
is listed - you get the race conditions when you want all the BBDISPLAY
servers to monitor all of the BBNET hosts (i.e. want BBNET to send their
client-side tests to the BBDISPLAY entities - this will result in the BBNET
poll messages going out to all the BBDISPLAY entities also).

It's doable, hacked the heck out of the BB code base years ago to support
three separate BBDISPLAY/BBNET servers that provided redundant monitoring
over a client base and over each other. The goal in this case is the BBNET
directs its status messages to only a defined group of BBDISPLAY servers -
that will only have the one source of BBNET message traffic. The rest of the
BBNET's tests would be allowed to go to a wider distribution of BBDISPLAY
servers. These other tests would be keyed to the BBNET's server name so
there would be no race or conflict conditions occurring on the BBDISPLAY
servers. 

The BBNET race condition may seem minor - but then think about what is going
on with any RRD database entries - you would be updating it from all the
BBNET entities in a given time window. Resulting trends can get really
bizarre if the BBNET polls are originating from different network segments
with different response times. Bad times from one segment getting masked in
the trends due to updates occurring from another segment etc.

Main difference in the commercial version of BB over the BTF version is that
they added support for encrypting the communications from the clients to the
servers. I would place some value on that as some sites are running external
tests that are sending sensitive client information to the BBDISPLAY boxes.
There may be a difference in the level of included documentation - but who
reads the documentation anyways.....

Maybe Mr. Croteau and Mr. MacGuire will do a LBO and take BB private again.
list Paul Williamson · Fri, 02 Nov 2007 08:30:38 -0400 ·
The biggest problem I have with going to Hobbit is there is no snmp trap sending support.  We don't use BB as our main interface for showing all alerts, but it is required and we do send snmp traps from the BBPAGER to our main dashboard of alerts.  Is this functionality in Hobbit yet?

************************************
This email may contain privileged and/or confidential information that is intended solely for the use of the addressee.  If you are not the intended recipient or entity, you are strictly prohibited from disclosing, copying, distributing or using any of the information contained in the transmission.  If you received this communication in error, please contact the sender immediately and destroy the material in its entirety, whether electronic or hard copy.  This communication may contain nonpublic personal information about consumers subject to the restrictions of the Gramm-Leach-Bliley Act and the Sarbanes-Oxley Act.  You may not directly or indirectly reuse or disclose such information for any purpose other than to provide the services for which you are receiving the information.
There are risks associated with the use of electronic transmission.  The sender of this information does not control the method of transmittal or service providers and assumes no duty or obligation for the security, receipt, or third party interception of this transmission.
************************************
list Henrik Størner · Fri, 2 Nov 2007 15:37:20 +0100 ·
Hi Joe,

On Thu, Nov 01, 2007 at 03:20:12PM -0700, Sloan wrote:
So, the $64 question: Is there anything in hobbit, or on the horizon,
which will allow hobbit to serve as a drop-in replacement for bb,
including the failover capability?
The BB "failover" script does two things: It makes the network tests
run on the failover server if the primary BBNET server cannot be
pinged; and it enables alerts to be sent from the failover server
if there is no connection from the failover server to the primary
BBPAGER server.


The network-test failover is fairly simple to do. I've attached two
scripts here, both of which must run on the backup/standby/failover 
server:

1) failover.sh - goes in ~hobbit/server/ext/
   Add a section to hobbitlaunch.cfg with

      [failovercheck]
	ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
	NEEDS hobbitd
	CMD $BBHOME/ext/failover.sh 10.0.0.1 hobbitnet.mydom.com

   "10.0.0.1" is the IP of your primary Hobbit server,
   "hobbitnet.mydom.com" is the hostname (in the bb-hosts file) of the
   primary network test machine.

   This script queries the primary Hobbit server for how long ago the
   network tests were updated. If that was more than 7 minutes ago, it
   deems the primary network test node to be DOWN and flags this via
   the file $BBTMP/primarynetDOWN. If the network test update was less
   than 7 minutes ago, it removes the file.

   This is then used by the other script, which replaces the CMD in the
   "[bbnet]" section in hobbitlaunch.cfg.

2) failovernet.sh - goes in ~hobbit/server/ext/
   When this runs to do the normal network tests, it will check for the 
   presence of the $BBTMP/primarynetDOWN file. If this file exists, it
   picks up the IP of the primary Hobbit server from the file, and
   modifies the settings to report data to both the normal (local)
   Hobbit server, and to the primary server. If the file does not exist,
   it will just run the network tests the normal way.
   So to run this, modify the [bbnet] section in hobbitlaunch.cfg and
   change the CMD setting to "$BBHOME/ext/failovernet.sh"
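
The attachments themselves are not shown here. As a rough sketch only
(this is not Henrik's actual failover.sh), the check described in step 1
might look something like the following. It assumes that the network
tester's own status column is called "bbtest", that the "logtime" field
of a hobbitdboard query holds the Unix time of the last update, and that
$BB and $BBTMP are set by hobbitserver.cfg. The function names are made
up for illustration.

```shell
#!/bin/sh
# Sketch only: NOT the attached failover.sh, just the logic described above.
# Assumptions: the status column is "bbtest" and "fields=logtime" returns
# the Unix timestamp of the last update. Function names are hypothetical.

MAXAGE=420   # 7 minutes, in seconds

# Succeeds (exit 0) when the last update is more than MAXAGE seconds old.
is_stale() {
    last_update=$1
    now=$2
    [ $((now - last_update)) -gt "$MAXAGE" ]
}

# Create or remove the flag file. The primary server's IP is stored in
# the file so that the network test wrapper can pick it up later.
set_flag() {
    flagfile=$1
    primary_ip=$2
    last_update=$3
    now=$4
    if is_stale "$last_update" "$now"; then
        echo "$primary_ip" > "$flagfile"
    else
        rm -f "$flagfile"
    fi
}

# Only query the server when called with both arguments, e.g.
#   failover.sh 10.0.0.1 hobbitnet.mydom.com
if [ $# -eq 2 ]; then
    LASTUPDATE=$($BB "$1" "hobbitdboard host=$2 test=bbtest fields=logtime")
    # An empty reply (primary unreachable) is treated as "never updated".
    set_flag "$BBTMP/primarynetDOWN" "$1" "${LASTUPDATE:-0}" "$(date +%s)"
fi
```

Writing the IP into the flag file is one way to satisfy the description
in step 2, where the second script reads the primary server's address
back from that file.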


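For the failovernet.sh wrapper, a similarly hedged sketch (again not the
attached script) could look like this. It assumes that setting
BBDISP=0.0.0.0 makes the Hobbit tools report to every address listed in
BBDISPLAYS, that BBSERVERIP holds the local server's IP, and that the
usual bbtest-net options are passed on the CMD line. The function name
report_targets is hypothetical.

```shell
#!/bin/sh
# Sketch only: NOT the attached failovernet.sh.
# Assumptions: BBDISP=0.0.0.0 plus BBDISPLAYS="ip1 ip2" sends reports to
# several servers; BBSERVERIP and BBTMP come from hobbitserver.cfg.

# Print the list of servers that should receive the test results.
report_targets() {
    flagfile=$1
    local_ip=$2
    if [ -s "$flagfile" ]; then
        # The flag file holds the primary server's IP address.
        echo "$local_ip $(cat "$flagfile")"
    else
        echo "$local_ip"
    fi
}

# Run the real network tester only inside the Hobbit environment.
if [ -n "${BBHOME:-}" ]; then
    BBDISPLAYS=$(report_targets "$BBTMP/primarynetDOWN" "$BBSERVERIP")
    BBDISP=0.0.0.0
    export BBDISP BBDISPLAYS
    # Pass through whatever options the [bbnet] CMD line supplies.
    exec bbtest-net "$@"
fi
```

When the flag file is absent the list contains only the local server,
so the tests run and report the normal way.
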
The alert failover is different, because Hobbit doesn't have a separate
BBPAGER server - alerts are sent from the same host that handles the
Hobbit data collection and webpages. A solution to this has been
implemented for the next release, where the alerting module can be
distributed onto multiple servers, but only one of them will send alerts
at any given time.


Regards,
Henrik
Attachments (2)
list Henrik Størner · Fri, 2 Nov 2007 15:43:35 +0100 ·
On Fri, Nov 02, 2007 at 08:30:38AM -0400, PAUL WILLIAMSON wrote:
The biggest problem I have with going to Hobbit is there is no snmp trap sending support.  We don't use BB as our main interface for showing all alerts, but it is required and we do send snmp traps from the BBPAGER to our main dashboard of alerts.  Is this functionality in Hobbit yet?
BB really just calls the "snmptrap" utility to send the trap message to
your SNMP based console. A script to do the same from hobbit-alerts.cfg
is a very simple solution to this.
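
A minimal sketch of such a script, assuming the net-snmp "snmptrap"
utility is installed and that the script is listed as a SCRIPT recipient
in hobbit-alerts.cfg with the trap console as its recipient. The OID,
community string and function name below are placeholders, not anything
defined by Hobbit or BB.

```shell
#!/bin/sh
# Sketch only: hypothetical alert script, e.g. referenced as
#   SCRIPT /usr/lib/hobbit/server/ext/sendtrap.sh snmpconsole.mydom.com
# Assumptions: net-snmp's snmptrap is available; TRAP_OID and COMMUNITY
# are placeholders to be replaced with site-specific values.

TRAP_OID=1.3.6.1.4.1.99999.1   # hypothetical enterprise OID
COMMUNITY=public

# Send one SNMPv2c trap carrying host, service and color as strings.
send_trap() {
    console=$1
    host=$2
    service=$3
    color=$4
    # The empty '' argument tells snmptrap to use the current uptime.
    "${SNMPTRAP:-snmptrap}" -v 2c -c "$COMMUNITY" "$console" '' "$TRAP_OID" \
        "$TRAP_OID.1" s "$host" \
        "$TRAP_OID.2" s "$service" \
        "$TRAP_OID.3" s "$color"
}

# Hobbit passes the alert details in environment variables; RCPT is the
# recipient given on the SCRIPT line.
if [ -n "${BBHOSTNAME:-}" ]; then
    send_trap "$RCPT" "$BBHOSTNAME" "$BBSVCNAME" "$BBCOLORLEVEL"
fi
```

The SNMPTRAP override is only there so the command line can be checked
with something harmless like "echo" before pointing it at a real console.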


Henrik
list Josh Luthman · Fri, 2 Nov 2007 10:44:17 -0400 ·
So I take it that Joe has to Paypal Henrik $64 now?

Please let me, and everyone else of course, know how the failover script
works on Hobbit.  I'd be very interested in knowing the result to this!

Thanks to all three of you!

-- 
Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX

Those who don't understand UNIX are condemned to reinvent it, poorly.
--- Henry Spencer
list Josh Luthman · Fri, 2 Nov 2007 10:47:52 -0400 ·
I believe there is a workaround for it using something else.  I know you can
find it in the mailing list archives - it has been asked and discussed many
times.

I read that Henrik has SNMP planned for a later release, too, though I don't
believe that is confirmed.

Josh

On 11/2/07, PAUL WILLIAMSON <user-4a2fa5b5a229@xymon.invalid> wrote:
 The biggest problem I have with going to Hobbit is there is no snmp trap
sending support.  We don't use BB as our main interface for showing all
alerts, but it is required and we do send snmp traps from the BBPAGER to our
main dashboard of alerts.  Is this functionality in Hobbit yet?

list Greg L Hubbard · Fri, 2 Nov 2007 10:01:57 -0500 ·
Paul,
 
Drop me a private email and I will tell you what I am doing for my "main
dashboard of alerts."
 
GLH


	From: PAUL WILLIAMSON [mailto:user-4a2fa5b5a229@xymon.invalid] 
	Sent: Friday, November 02, 2007 7:31 AM
	To: user-ae9b8668bcde@xymon.invalid
	Subject: RE: [hobbit] big brother replacement
	
	
	The biggest problem I have with going to Hobbit is there is no
snmp trap sending support.  We don't use BB as our main interface for
showing all alerts, but it is required and we do send snmp traps from
the BBPAGER to our main dashboard of alerts.  Is this functionality in
Hobbit yet?

list Tod Hansmann · Fri, 2 Nov 2007 09:07:28 -0600 ·
For posterity:

We use mrtg to do all our SNMP polling (I know devmon is the popular
solution here, but it's messy and spread across so many files that we
have a bad taste for it), and for traps we have a script we stopped
using recently, but it was set up much like Henrik stated.  It was just
added to hobbit-alerts.cfg and life was good.

Tod Hansmann
Network Engineer
<http://www.directpointe.com/>


list Buchan Milne · Fri, 2 Nov 2007 17:47:11 +0200 ·
On Friday 02 November 2007 17:07:28 Tod Hansmann wrote:
For posterity:


We use mrtg to do all our SNMP polling (I know devmon is the popular
solution here, but it's messy and spread across so many files that we
have a bad taste for it)
Would you like to expand on what's wrong with devmon? The problem I have with mrtg/cacti etc. is that they *only* do trends, which is half the job. I have got interface graphs (as well as temperature for Dell and HP servers) working, and there is very little that our cacti installation is doing that I still need to have available on devmon.

Regards,
Buchan
list Joe Sloan · Fri, 02 Nov 2007 10:18:24 -0700 ·
This looks promising, I'll give it a whirl -

Joe
list Joe Sloan · Fri, 02 Nov 2007 10:19:35 -0700 ·
Yes, but keep in mind that's $64 octal.

Joe

Josh Luthman wrote:
So I take it that Joe has to Paypal Henrik $64 now?

Please let me, and everyone else of course, know how the failover
script works on Hobbit.  I'd be very interested in knowing the result
to this!

Thanks to all three of you!

list Joe Sloan · Fri, 02 Nov 2007 10:20:48 -0700 ·
Hello Tod,

Since you've stopped using the script, would you be willing to share it
for posterity's sake?

Joe
list Galen Johnson · Fri, 2 Nov 2007 13:27:02 -0400 ·
Aw...don't be cheap...go ahead and kick in the other $12...

-----Original Message-----
From: Sloan [mailto:user-b1d2c84d244b@xymon.invalid] 
Sent: Friday, November 02, 2007 1:20 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] big brother replacement

Yes, but keep in mind that's $64 octal.

Joe

list Tod Hansmann · Fri, 2 Nov 2007 11:38:46 -0600 ·
Sorry, no can do.  It's long gone.  I didn't set it up but I'm pretty
sure we just got it off of deadcat (though I just did a couple searches
and found nothing like what we were using).  

Probably wouldn't be hard to write a new one for someone who knows snmp
"syntax" well enough.  Just pass the hobbit message to the server
receiving traps, and you're golden.

It wasn't exactly helpful for us, since we use the emails and the display
page as the key modes of contact for alarms in our environment.

Tod Hansmann
Network Engineer
 
 
-----Original Message-----
From: Sloan [mailto:user-b1d2c84d244b@xymon.invalid] 
Sent: Friday, November 02, 2007 11:21 AM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] big brother replacement

Hello Tod,

Since you've stopped using the script, would you be willing to share it
for posterity sake?

Joe

list Joe Sloan · Fri, 02 Nov 2007 11:30:58 -0700 ·
Galen Johnson wrote:
Aw...don't be cheap...go ahead and kick in the other $12...
  
OK you win - $40 hex, as soon as I get paid.

Joe
quoted from Galen Johnson
-----Original Message-----
From: Sloan [mailto:user-b1d2c84d244b@xymon.invalid] 
Sent: Friday, November 02, 2007 1:20 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] big brother replacement

Yes, but keep in mind that's $64 octal.

Joe

Josh Luthman wrote:
  
So I take it that Joe has to Paypal Henrik $64 now?

Please let me, and everyone else of course, know how the failover
script works on Hobbit.  I'd be very interested in knowing the result
to this!

Thanks to all three of you!

On 11/2/07, Henrik Stoerner <user-ce4a2c883f75@xymon.invalid> wrote:

    Hi Joe,

    On Thu, Nov 01, 2007 at 03:20:12PM -0700, Sloan wrote:
So, the $64 question: Is there anything in hobbit, or on the horizon,
which will allow hobbit to serve as a drop-in replacement for bb,
including the failover capability?

    The BB "failover" script does two things: It makes the network tests
    run on the failover server if the primary BBNET server cannot be
    ping'ed; and it enables alerts being sent from the failover server
    if there is no connection from the failover server to the primary
    BBPAGER server.


    The network-test failover is fairly simple to do. I've attached two
    scripts here, both of which must run on the backup/standby/failover
    server:

    1) failover.sh - goes in ~hobbit/server/ext/
       Add a section to hobbitlaunch.cfg with

          [failovercheck]
            ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
            NEEDS hobbitd
            CMD $BBHOME/ext/failover.sh 10.0.0.1 hobbitnet.mydom.com

       "10.0.0.1" is the IP of your primary Hobbit server,
       "hobbitnet.mydom.com" is the hostname (in the bb-hosts file) of the
       primary network test machine.

       What this does is that it queries the primary Hobbit server for how
       long ago the network tests were updated. If more than 7 minutes ago
       it deems the primary network test node to be DOWN, and flags this via
       the file $BBTMP/primarynetDOWN. If the network test update was less
       than 7 minutes ago, it removes the file.

       This is then used by the other script, which replaces the CMD in the
       "[bbnet]" section in hobbitlaunch.cfg.

    2) failovernet.sh - goes in ~hobbit/server/ext/
       When this runs to do the normal network tests, it will check for the
       presence of the $BBTMP/primarynetDOWN file. If this file exists, it
       picks up the IP of the primary Hobbit server from the file, and
       modifies the settings to report data to both the normal (local)
       Hobbit server, and to the primary server. If the file does not exist,
       it will just run the network tests the normal way.
       So to run this, modify the [bbnet] section in hobbitlaunch.cfg and
       change the CMD setting to "$BBHOME/server/ext/failovernet.sh"
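
As a rough illustration of the two scripts described above (the attachments themselves are not reproduced here, so this is not Henrik's code), the logic could be sketched in shell along these lines. The function names update_flag and report_targets, their argument order, and the idea of passing the time of the last network-test update in as an epoch value are all assumptions made for the example; how failover.sh actually queries the primary Hobbit server is left out.

```shell
#!/bin/sh
# Illustrative sketch only; function names and arguments are hypothetical.

MAXAGE=420   # 7 minutes, the threshold mentioned in the description

# failover.sh side: create or remove the flag file.
# Usage: update_flag LAST_UPDATE_EPOCH NOW_EPOCH PRIMARY_IP FLAGFILE
update_flag() {
    last_update=$1
    now=$2
    primary_ip=$3
    flagfile=$4

    age=$((now - last_update))
    if [ "$age" -gt "$MAXAGE" ]; then
        # Network tests on the primary look stale: flag it as DOWN and
        # store the primary server's IP for the network-test wrapper.
        printf '%s\n' "$primary_ip" > "$flagfile"
    else
        # Tests were updated recently: clear the flag.
        rm -f "$flagfile"
    fi
}

# failovernet.sh side: decide which Hobbit servers get the results.
# Usage: report_targets LOCAL_IP FLAGFILE
report_targets() {
    local_ip=$1
    flagfile=$2

    if [ -s "$flagfile" ]; then
        # Flag present: report to the local server and to the primary.
        printf '%s %s\n' "$local_ip" "$(cat "$flagfile")"
    else
        # No flag: report to the local server only.
        printf '%s\n' "$local_ip"
    fi
}
```

With $BBTMP/primarynetDOWN as the flag file, update_flag would correspond to the [failovercheck] task and report_targets to the wrapper that replaces the [bbnet] CMD; the network test command itself is not shown.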


    The alert failover is different, because Hobbit doesn't have a separate
    BBPAGER server - alerts are sent from the same host that handles the
    Hobbit data collection and webpages. A solution to this has been
    implemented for the next release, where the alerting module can be
    distributed onto multiple servers, but only one of them will send alerts
    at any given time.


    Regards,
    Henrik


list Galen Johnson · Fri, 2 Nov 2007 14:44:07 -0400 ·
Just offer $1000000 binary...man, I'm on geek overload...
list Gary Baluha · Fri, 2 Nov 2007 16:34:55 -0400 ·
On 11/2/07, Galen Johnson <user-87f955643e3d@xymon.invalid> wrote:
Just offer $1000000 binary...man, I'm on geek overload...

::shakes head::
list Joe Sloan · Fri, 02 Nov 2007 14:48:32 -0700 ·
quoted from Gary Baluha
Henrik Stoerner wrote:
The alert failover is different, because Hobbit doesn't have a separate
BBPAGER server - alerts are sent from the same host that handles the
Hobbit data collection and webpages. A solution to this has been
implemented for the next release, where the alerting module can be
distributed onto multiple servers, but only one of them will send alerts
at any given time.

  
So the distributed alerting capability will be in the
soon-to-be-released 4.3.0?

Joe
list Joe Sloan · Fri, 02 Nov 2007 16:43:50 -0700 ·
quoted from Mark Deiss
Deiss, Mark wrote:
For a vanilla BB environment, you can have multiple BBDISPLAY entities
but the recommendation is that there is only one BBNET entity. A BBNET
server that is generating the pings out to the clients will be sending
the ping results to all of the BBDISPLAY entities (as defined on the
BBNET host). If you have multiple BBNET entities that ping the same
servers, you will be sending duplicated results as far as the
individual BBDISPLAY servers are concerned (the connection messages
will be renamed to the host being pinged). To support multiple BBNETs
in a non-race environment requires additional coding to carefully
direct the BBNET results to not trip over each other. The default
behavior is to pump them out to whatever BBDISPLAY is listed - you get
the race conditions when you want all the BBDISPLAY servers to monitor
all of the BBNET hosts (i.e. want BBNET to send their client-side
tests to the BBDISPLAY entities - this will result in the BBNET poll
messages going out to all the BBDISPLAY entities also).

Interestingly enough, we've been running redundant bb servers for each
lan, without any concern for race conditions, and while that has its own
peculiar behavior in corner cases, we've never seen any sort of real,
intractable problems with it. The general consensus here is that
redundancy is good, except for the notifications - we don't want to be
notified twice for every incident, thus the so-called bb "failover"
capability saves us that annoyance with no extra hacks required.

I probably made it sound a lot more sophisticated than it really is - we
really just have active/active BBNET/BBDISPLAY servers, with the
delegation of BBPAGER decided by the failover status.

It looks like Henrik has a good roadmap to get there in 4.3 from what I
read here, so hopefully we've got our bb replacement at last. The only
other concern is that we copy all bb notifications as snmp traps to
netcool, but it looks as though that should be doable with a hobbit plugin.

Joe
list Josh Luthman · Fri, 2 Nov 2007 21:17:10 -0400 ·
Joe,

Do you have any support to any extent with BB?  The main reason I switched
was that there was a mailing list to look to for support.  Secondly, it
wasn't BB.

Josh
list Joe Sloan · Fri, 02 Nov 2007 19:28:41 -0700 ·
We have no official bb support, just google, and the experience of local sys
admins. But it's been here so long that infrastructure has grown up around it,
which really drives the need for a drop-in replacement.

Joe
list Josh Luthman · Fri, 2 Nov 2007 22:46:05 -0400 ·
I really like this mailing list - you learn quite a bit even just by reading
the questions of others.  I'm the only system administrator around here so I
need a shoulder to lean on every once in a while!

Are you saying you've used BB so much that you know it too well and need a
replacement?  Sounds kind of backwards to me =P
list Ryan Jay B. Lapuz · Sat, 3 Nov 2007 14:27:42 +0800 ·
Hi Henrik,

I just set up a backup server for our primary hobbit server yesterday,
inspired by this:

From Henrik:
I run two completely separate systems in parallel, and have the clients
report to both of them. The system at our disaster center has the paging
module disabled (just disable the [bbpage] section in hobbitlaunch.cfg),
to avoid double alerts - it is simple to activate it, if necessary.

(That activation is manual, I think.)

That is my current setup. However, you have just posted a failover script
that makes the failover transition automatic. The concept is the same,
but will the clients still report to both Hobbit servers when there is
no failover?

Thanks and regards,
Ryan
list Joe Sloan · Mon, 05 Nov 2007 10:21:11 -0800 ·
quoted from Josh Luthman
Josh Luthman wrote:
I really like this mailing list - you learn quite a bit even just by
reading the questions of others.  I'm the only system administrator
around here so I need a shoulder to lean on every once and a while!

Are you saying you've used BB so much that you know it too well and
need a replacement?  Sounds kind of backwards to me =P

Haha, what I mean is, big brother is showing its age, and getting
harder to support as time goes by, simply by virtue of the fact that the
code is slowly rotting. But we can't just make a clean sweep - so much
now depends on the way big brother behaves, that the replacement needs
to be able to act identically to big brother.

Joe
list Josh Luthman · Mon, 5 Nov 2007 13:25:46 -0500 ·
Joe,

I see what you're saying now.  In your case Hobbit is an absolute perfect
match!  Hopefully we can all enjoy seeing your questions and see you
answering some of ours =)

Josh
list Ralph Mitchell · Mon, 5 Nov 2007 12:43:49 -0600 ·
quoted from Josh Luthman
On 11/5/07, Sloan <user-b1d2c84d244b@xymon.invalid> wrote:
Haha, what I mean is, big brother is showing its age, and getting
harder to support as time goes by, simply by virtue of the fact that the
code is slowly rotting. But we can't just make a clean sweep - so much
now depends on the way big brother behaves, that the replacement needs
to be able to act identically to big brother.

In your original email you said:

   "only one of the bb servers does reporting, as determined by the
failover state"

Does that mean only one of the two sends out notifications??  If so,
why not just have two identical Hobbit servers, with all the clients
sending reports to both, and disable the [bbpage] section in
server/etc/hobbitlaunch.cfg on the backup system.  Then your failover
solution only has to detect "death of twin" and rewrite the
hobbitlaunch.cfg with DISABLED removed from that section.  I guess
Hobbit might need to be reloaded or restarted for the change to take
effect.

I haven't tried this - we don't use the paging system - I'm just
wondering if something as simple as that might work.
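
The "remove DISABLED" step suggested above could look roughly like this. It is only a sketch under a few assumptions: the section header starts in the first column, DISABLED sits on a line of its own inside the section, and enable_section is a made-up helper name. It prints the rewritten file to standard output, so putting the result in place and restarting or reloading Hobbit are left to the caller.

```shell
#!/bin/sh
# Hypothetical helper: print CFGFILE with the DISABLED line removed
# from one section only.
# Usage: enable_section SECTION CFGFILE
enable_section() {
    section=$1
    cfg=$2

    awk -v sec="[$section]" '
        # A line starting with "[" opens a new section; remember whether
        # it is the one we want to enable.
        /^\[/ { insec = ($1 == sec) }
        # Inside that section, drop the DISABLED keyword line.
        insec && $1 == "DISABLED" { next }
        # Everything else is passed through unchanged.
        { print }
    ' "$cfg"
}
```

For example, `enable_section bbpage hobbitlaunch.cfg > hobbitlaunch.cfg.new` would leave DISABLED lines in every other section untouched.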

Ralph Mitchell
list Joe Sloan · Mon, 05 Nov 2007 11:31:46 -0800 ·
quoted from Ralph Mitchell
Ralph Mitchell wrote:
In your original email you said:

   "only one of the bb servers does reporting, as determined by the
failover state"

Does that mean only one of the two sends out notifications??  If so,
why not just have two identical Hobbit servers, with all the clients
sending reports to both, and disable the [bbpage] section in
server/etc/hobbitlaunch.cfg on the backup system.  Then your failover
solution only has to detect "death of twin" and rewrite the
hobbitlaunch.cfg with DISABLED removed from that section.  I guess
Hobbit might need to be reloaded or restarted for the change to take
effect.

I haven't tried this - we don't use the paging system - I'm just
wondering if something as simple as that might work.
  
Yes, that approach has merit, it could work.

However I still need to give Henrik's quick solution from last week some
testing - I'm currently building a hobbit infrastructure so I can see
how his failover script behaves under pressure.

Joe
list Henrik Størner · Tue, 6 Nov 2007 07:32:14 +0100 ·
On Fri, Nov 02, 2007 at 02:48:32PM -0700, Sloan wrote:
Henrik Stoerner wrote:
The alert failover is different, because Hobbit doesn't have a separate
BBPAGER server - alerts are sent from the same host that handles the
Hobbit data collection and webpages. A solution to this has been
implemented for the next release, where the alerting module can be
distributed onto multiple servers, but only one of them will send alerts
at any given time.
So the distributed alerting capability will be in the
soon-to-be-released 4.3.0?
Yes.


Henrik
list Henrik Størner · Tue, 6 Nov 2007 07:42:05 +0100 ·
quoted from Ryan Jay B. Lapuz
On Sat, Nov 03, 2007 at 02:27:42PM +0800, Ryan Jay B. Lapuz wrote:
I just set up a backup server for our primary hobbit server yesterday,
inspired by this:

From Henrik:
I run two completely separate systems in parallel, and have the clients
report to both of them. The system at our disaster center has the paging
module disabled (just disable the [bbpage] section in hobbitlaunch.cfg),
to avoid double alerts - it is simple to activate it, if necessary.
(That activation is manual, I think.)

That is my current setup. However, you have just posted a failover script
that makes the failover transition automatic. The concept is the same,
but will the clients still report to both Hobbit servers when there is
no failover?

The failover script is used in a scenario where you have two servers
running the network tests - and ONLY the network tests, not the Hobbit
display - each reporting to their own Hobbit server. If the primary
network test server goes down, then you want the backup server to
automatically start feeding the test results to both servers. In that
case it becomes necessary to have a failover from the primary to the
backup server to do the network tests, and that is what the script I
posted does.

But in both scenarios you'd probably want to have the "full picture" of
the health of your systems, so you do need to have the data from the
clients available, both on the primary and on the backup system.
Therefore the clients should be configured to send data to both systems.
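
For illustration, pointing a client at both systems would presumably come down to two settings like the ones below. This is a sketch, assuming the usual BBDISP/BBDISPLAYS variables in the client's configuration file; 10.0.0.1 is the primary server address used earlier in the thread, and 10.0.0.2 is a made-up address for the backup system. Check the configuration documentation for your version before relying on it.

```shell
# Client-side settings (shell-style assignments); addresses are examples.
BBDISP="0.0.0.0"                 # 0.0.0.0 means: use the BBDISPLAYS list
BBDISPLAYS="10.0.0.1 10.0.0.2"   # primary and (hypothetical) backup server
```

With this in place each client report goes to every address in the list, so both servers see the same client data regardless of the failover state.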


Henrik
list Tom L. Stewart · Mon, 12 Nov 2007 16:41:49 -0600 ·
 
I tried to set up two dummy hosts to use as clones for critical systems.

The first one is for the conn test, and I added a bunch of systems to
it (Conn_Host_P1).

That worked fine.

Next I created another clone master called something else
(Proc_Host_P1).
I used a system that was already set in (Conn_Host_P1), and I added it
to Proc_Host_P1. However, now the added system disappears from
Conn_Host_P1 as a clone and only appears in Proc_Host_P1.

Am I understanding that only 1 master clone item can exist? I wanted
master tests for conn and procs and a few other items.

I am using the web page Edit Critical Systems to do this.

The file permissions are set as hobbit owner and webxxx group in
server/etc/hobbit-nkview.cfg.

Any insight would be appreciated.

Tom
list Joe Sloan · Fri, 10 Apr 2009 21:45:18 -0700 ·
quoted from Henrik Størner
On 20071102, Henrik Stoerner wrote:
The alert failover is different, because Hobbit doesn't have a separate
BBPAGER server - alerts are sent from the same host that handles the
Hobbit data collection and webpages. A solution to this has been
implemented for the next release, where the alerting module can be
distributed onto multiple servers, but only one of them will send alerts
at any given time.
  
Could you elaborate on this alerting module configuration, or point me to TFM? If it has indeed been implemented, it sounds like it would be the key to enabling hobbit to emulate the big brother style alerting failover behavior.

Joe