Xymon Mailing List Archive

grouping methods

Josh Luthman
Mon, 16 Jun 2008 13:57:39 -0400
Message-Id: <user-e1af260e9531@xymon.invalid>

This is quite obviously a well-known problem and a sought-after feature:
getting redundant Hobbit servers.

Please help us, code monkeys =)

Josh

On Mon, Jun 16, 2008 at 1:45 PM, Sloan <user-b1d2c84d244b@xymon.invalid> wrote:
Josh Luthman wrote:
Not sure what the real reasoning is behind this, but if you have 3000
servers split across 3 Hobbit servers, 1000 each, and one Hobbit
server goes down, you lose monitoring for 1000 of the 3000.  If you have
all 3000 servers monitored by 1 Hobbit server, that single point of
failure leaves you blind to all 3000 servers.

We do it with redundancy. Each server in our various data centers is
monitored by two bb servers. Only one of the two is set up to send
notifications, but in all other respects the monitoring is active/active,
so we get one notification per alert rather than a pair of redundant
notifications.

We've not had a bb server go down in all the years we've been using it, but
sometimes WAN connectivity goes away due to circumstances beyond our
control, and a bb server in Arizona can't talk to the corresponding bb
server in California. The normally passive monitoring server then goes into
failover mode and begins sending notifications for alerts, since it can't
verify that the other bb server is alive.

Thus, we always receive notifications for all alerts. In the worst case, a
split-brain situation, we get redundant notifications, which is the lesser
of the evils.
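
The failover rule described above can be sketched roughly as follows. This
is only an illustration of the decision logic, not how bb or Hobbit
implements it: the host names bb-az and bb-ca, the Monitor class, and the
should_notify and senders functions are all made up for the example, and
which site is the passive one is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Monitor:
    name: str                  # hypothetical host name
    is_primary_notifier: bool  # True for the server configured to send alerts


def should_notify(me: Monitor, peer_reachable: bool) -> bool:
    """Decide whether this monitor sends notifications for an alert."""
    if me.is_primary_notifier:
        return True
    # The passive server notifies only when it cannot verify its peer is alive.
    return not peer_reachable


def senders(primary_alive: bool, passive_alive: bool, link_up: bool) -> List[str]:
    """List which monitors send a notification for a single alert."""
    primary = Monitor("bb-az", True)    # assumed primary notifier
    passive = Monitor("bb-ca", False)   # assumed passive peer
    result = []
    # A monitor sees its peer only if the peer is up and the WAN link works.
    if primary_alive and should_notify(primary, passive_alive and link_up):
        result.append(primary.name)
    if passive_alive and should_notify(passive, primary_alive and link_up):
        result.append(passive.name)
    return result


if __name__ == "__main__":
    print(senders(True, True, True))    # normal operation
    print(senders(True, True, False))   # WAN link down: split brain
    print(senders(False, True, True))   # primary notifier down
```

In normal operation only the primary sends. With the WAN link down both
send, which is the redundant-notification case. With the primary down the
passive server takes over, so an alert is never left without a notification
as long as one server is alive.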

Once this notification failover capability makes it into hobbit, we can
finally switch from bb to hobbit.

Joe

-- 
Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX

Those who don't understand UNIX are condemned to reinvent it, poorly.
--- Henry Spencer