Xymon Mailing List Archive

Suitability for bb replacement in large enterprise

7 messages in this thread

Joe Sloan · Sat, 06 Dec 2008 13:25:34 -0800
Hello list,

I learned about hobbit a couple years back, and I've been using it in some small shops, but I'm hoping to be able to deploy it for my main employer.

At my day job, we've been using the old big brother 1.9e plus bbgen-3.5 to monitor hundreds of Unix and Windows servers spread across 2 data centers. I'd like to move to hobbit, but it is important that it be able to do what bb does. For the most part it is better than bb, but I have 2 specific areas of concern:

1. snmp traps: Our netcool ticketing system relies on snmp traps from big brother whenever there is a significant event.

Are any xymon users currently using snmp traps in a similar sort of way?

2. alerting failover: We currently have active-active monitoring of all our systems. That is, the bb servers in both data centers redundantly monitor all the servers. It would be a nuisance to get 2 pages for every event, so we use the bb failover setup: normally only the bb server in data center "A" sends alerts, but if the bb server in data center "B" can't reach the bb server in data center "A", it goes into failover mode and sends the alerts on behalf of the unreachable server. The worst case would be some sort of split brain where we get 2 alerts, but we have not seen that scenario arise, and it works well.
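
A minimal sketch of that failover decision, as it might run on the server in data center "B". This is not how BB or Xymon implement it; the host name `bb-a.example.com`, the `PROBE` variable, and the ping-based reachability check are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of the failover decision on the standby server (data center "B").
# PRIMARY and PROBE are hypothetical placeholders, not BB or Xymon settings.

PRIMARY="bb-a.example.com"             # hypothetical name of the server in data center "A"
PROBE="${PROBE:-ping -c 1 $PRIMARY}"   # assumed reachability check; replace as needed

should_send_alert() {
    # The standby stays silent while the primary answers the probe.
    if $PROBE >/dev/null 2>&1; then
        return 1    # primary reachable: it sends the alert itself
    fi
    return 0        # primary unreachable: alert on its behalf
}
```

An alert script on server "B" could start with `should_send_alert || exit 0`. If only the link between the data centers fails, both servers would alert, which is the split-brain case mentioned above.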

Is xymon 4.2.2 capable of this sort of failover behavior?

I'm hoping to replace bb with xymon, but these 2 items are deal breakers.

Thanks in advance for your words of wisdom,

Joe
Dominique Frise · Sun, 07 Dec 2008 11:46:14 +0100
Joe Sloan wrote:
1. snmp traps: Our netcool ticketing system relies on snmp traps from big brother whenever there is a significant event.
We use the elegant method from Andy Farrior : http://cerebro.victoriacollege.edu/hobbit-trap.html


Dominique
UNIL - University of Lausanne
Joe Sloan · Sun, 07 Dec 2008 09:48:06 -0800
Dominique Frise wrote:
Joe Sloan wrote:
1. snmp traps: Our netcool ticketing system relies on snmp traps from 
big brother whenever there is a significant event.
We use the elegant method from Andy Farrior : 
http://cerebro.victoriacollege.edu/hobbit-trap.html
Interesting, thanks for the reference.

Joe
Buchan Milne · Mon, 8 Dec 2008 16:08:21 +0200
On Sunday 07 December 2008 12:46:14 Dominique Frise wrote:
Joe Sloan wrote:
1. snmp traps: Our netcool ticketing system relies on snmp traps from
big brother whenever there is a significant event.
We use the elegant method from Andy Farrior :
http://cerebro.victoriacollege.edu/hobbit-trap.html
This is the wrong way around, though: it adds support for Hobbit reporting on
SNMP traps sent by other devices, not for sending SNMP traps for alerts.

However, it should be relatively easy to make an alerting script to plug into 
Hobbit to send traps.

The biggest requirement for someone to be able to test such a script 
compatible with the BB feature would be the MIB file used by BB ...

Regards,
Buchan
Henrik Størner · Mon, 8 Dec 2008 21:56:08 +0000 (UTC)
In <user-e9442dd849b2@xymon.invalid> Buchan Milne <user-9b139aff4dec@xymon.invalid> writes:
However, it should be relatively easy to make an alerting script to plug into 
Hobbit to send traps.
The biggest requirement for someone to be able to test such a script 
compatible with the BB feature would be the MIB file used by BB ...
As far as I recall, BB just sends a trap message using an OID
that's been configured in the alert-setup config. A trap is really
just a plain text-string wrapped into an SNMP message.

So there's no MIB involved, other than the "MIB" which has only
the one OID which is used for the trap.


Regards,
Henrik
Joe Sloan · Mon, 08 Dec 2008 14:54:39 -0800
Henrik Størner wrote:
In <user-e9442dd849b2@xymon.invalid> Buchan Milne <user-9b139aff4dec@xymon.invalid> writes:
However, it should be relatively easy to make an alerting script to plug into 
Hobbit to send traps.
The biggest requirement for someone to be able to test such a script 
compatible with the BB feature would be the MIB file used by BB ...
As far as I recall, BB just sends a trap message using an OID
that's been configured in the alert-setup config. A trap is really
just a plain text-string wrapped into an SNMP message.

So there's no MIB involved, other than the "MIB" which has only
the one OID which is used for the trap.

Yes, it's very simple and straightforward: send out an snmp trap based
on an event, per the bbwarnrules.cfg, just like the email alerts. If
hobbit can be made to do that, one of the two key requirements would be
met, and only the failover behaviour would still need to be resolved.

Joe
Henrik Størner · Tue, 9 Dec 2008 06:57:41 +0000 (UTC)
In <user-0d7933c91e98@xymon.invalid> J Sloan <user-b1d2c84d244b@xymon.invalid> writes:
Yes, it's very simple and straightforward: send out an snmp trap based
on an event, per the bbwarnrules.cfg, just like the email alerts. If
hobbit can be made to do that, one of the two key requirements would be
met, and only the failover behaviour would still need to be resolved.
In hobbit-alerts.cfg you'd have something like

    HOST=* SERVICE=disk
        SCRIPT /usr/local/bin/trapmessage 0

Then your /usr/local/bin/trapmessage would do the trap-sending.
E.g. using Net-SNMP tools and the defaults from BB's bbwarnsetup.cfg:
 
    #!/bin/sh

    OID="enterprises.7058"    # 7058 is the Big Brother OID
    SNMPSTATION="10.3.2.1"    # IP of your monitoring system

    # BB maps a service to a numeric trapcode. See the
    # 'trapcodes' definition in bbwarnsetup.cfg
    # Add those you use here.
    case "$BBSVCNAME" in
      "disk")
          TRAPCODE="2"
          ;;
      "cpu")
          TRAPCODE="4"
          ;;
      *)
          # No trapcode defined for this service: send nothing
          exit 0
          ;;
    esac
    # ... and adds 1 if the status has recovered
    if test "$RECOVERED" = "1"
    then
       TRAPCODE=$((TRAPCODE + 1))
    fi

    snmptrap -v1 -c public "$SNMPSTATION" "$OID" \
       "$BBHOSTNAME" 6 "$TRAPCODE" '' "$OID" s "$BBALPHAMSG"

    exit 0
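
Before wiring the script into the alert config, it may help to check the trapcode logic without sending anything. The following is only a dry-run sketch of the same logic: `trapcode_for` and `trap_command` are hypothetical helper names, the host `web01` and the message text are made up, and it prints the `snmptrap` command line instead of running it.

```shell
#!/bin/sh
# Dry-run sketch: prints the snmptrap command instead of executing it.
# OID and SNMPSTATION are the values from the trapmessage example above.

OID="enterprises.7058"
SNMPSTATION="10.3.2.1"

trapcode_for() {
    # $1 = service name (BBSVCNAME), $2 = "1" if the status has recovered
    case "$1" in
        disk) code=2 ;;
        cpu)  code=4 ;;
        *)    return 1 ;;    # unknown service: no trapcode
    esac
    if [ "$2" = "1" ]; then
        code=$((code + 1))
    fi
    echo "$code"
}

trap_command() {
    # $1 = host, $2 = service, $3 = recovered flag, $4 = message text
    code=$(trapcode_for "$2" "$3") || return 1
    printf 'snmptrap -v1 -c public %s %s %s 6 %s "" %s s "%s"\n' \
        "$SNMPSTATION" "$OID" "$1" "$code" "$OID" "$4"
}

# Hypothetical example: a recovered disk alert on host web01
trap_command web01 disk 1 "disk green"
# prints: snmptrap -v1 -c public 10.3.2.1 enterprises.7058 web01 6 3 "" enterprises.7058 s "disk green"
```

Once the printed command looks right, the real script can be exercised by hand by setting BBHOSTNAME, BBSVCNAME, RECOVERED and BBALPHAMSG in the environment before running it.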


Regards,
Henrik