Xymon Mailing List Archive

Multiple hobbit (bbproxy and bb server) queries

Ralph Mitchell
Tue, 7 Jul 2009 12:31:36 -0500
Message-Id: <user-1d15bc7d1a3e@xymon.invalid>

What version of hobbit/xymon are you running?  I used to have a problem
like that with 4.2.  No bbproxy involved there, just several hobbit servers.
If one of them was down, the server/bin/bb command would hang trying to
talk to it.  It should have either failed to connect or timed out, but it
didn't.  Anyway...

You could set up a simple heartbeat script, a bit like this:

     #!/bin/bash

     # Send a green status that expires after 2 minutes unless refreshed.
     $BB $BBDISP "status+2 bbproxyname.panicnow green `date`
        If this is purple, a bbproxy died."

Set that up to launch every minute.  The message has a lifetime of 2
minutes, so it'll go purple about 3 minutes after the bbproxy hangs up or
dies.  You might want to pick a different column name.  :)

Ralph Mitchell


On Tue, Jul 7, 2009 at 8:40 AM, <user-c15424b7e83a@xymon.invalid> wrote:
Hi guys,

You may remember my questions from last week. Thanks again for the answers.
I have now implemented it; however, I have a few questions (and possibly
bugs?). I will start by describing the setup. To keep things simple I'll
refer to each hobbit server IP as "A", "B" or "C" depending on the data centre.

Data centre 1:
bbproxy and bbserver (running on same box, A). bbproxy configured to send
to B,C,A. bbserver configured to talk to A,B,C in hobbitserver.cfg.

Data centre 2:
bbproxy and bbserver (running on same box, B). bbproxy configured to send
to C,A,B. bbserver configured to talk to B,C,A in hobbitserver.cfg.

Data centre 3:
bbproxy and bbserver (running on same box, C). bbproxy configured to send
to A,B,C. bbserver configured to talk to C,A,B in hobbitserver.cfg.

Firstly, everything looks good. However, if I stop bbserver at A (but
keep bbproxy running at A) then shortly afterwards bbproxy at A will start
crashing. I've tried changing the order of --bbdisplays and it seems like
the bbproxy will crash if the last bbdisplay IP has been shut down or is
unavailable. Is this known, or is there a workaround?

To explain this better - Let's say bbproxy at site B is configured as
--bbdisplays=B,C,A. If I kill the xymon server at site A then this proxy
will crash shortly afterwards. Note I'm on x86 Solaris.


My other question is this - currently if a proxy crashes and the other 2
xymon servers do not receive updates, most tests continue to stay green. I'm
sure I've seen a configuration option but I can't seem to find it - can I
configure these tests to go purple if they don't receive an update within
the next 5-10 minutes? They've only just gone purple after 30 minutes, but
we really need to know within 5 minutes if we haven't received a valid
update.

Many thanks,
James.