I have used this method with great success, but it is a pain in the
you-know-what to maintain. It would be nice if this "router" tagging
could be made recursive so you only have to specify one upstream host
for each host, assuming that the upstream host is also in Hobbit. As it
is today you have to specify the full path to each "leaf" and this can
get long.
GLH
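
A minimal sketch of the recursive idea described above, assuming each host records only its single upstream host and the full path is derived by walking that chain. The function name `expand_routes`, the host names, and the nearest-first ordering are hypothetical and used only for illustration; the order the bb-hosts "route:" tag actually expects should be checked against the bb-hosts documentation.

```python
def expand_routes(upstream):
    """Expand a {host: upstream_host_or_None} map into full paths.

    Returns {host: [nearest upstream, ..., outermost upstream]}.
    The nearest-first ordering is an assumption made for this sketch.
    """
    routes = {}
    for host in upstream:
        path = []
        seen = {host}
        hop = upstream.get(host)
        while hop is not None:
            if hop in seen:
                # Guard against a misconfigured loop (a -> b -> a).
                raise ValueError(f"routing loop at {hop!r} while expanding {host!r}")
            seen.add(hop)
            path.append(hop)
            hop = upstream.get(hop)
        routes[host] = path
    return routes


if __name__ == "__main__":
    # Hypothetical hosts: one upstream per host, nothing more.
    example = {
        "core-rtr": None,
        "branch-rtr": "core-rtr",
        "web01": "branch-rtr",
    }
    for host, path in expand_routes(example).items():
        tag = "route:" + ",".join(path) if path else "(no route tag)"
        print(host, tag)
```

With something like this run as a preprocessing step, each "leaf" would only need its immediate upstream listed, and the long tags could be generated rather than maintained by hand.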
-----Original Message-----
From: Rich Smrcina [mailto:user-cf452ff334e0@xymon.invalid]
Sent: Monday, June 16, 2008 1:18 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] grouping methods
If this is a situation of routed networks, Hobbit can be told about that
with directives in the bb-hosts file. If it knows the router in front of
a host is down, it will only notify for the router, not the hosts behind
the router.
Linder, Doug (SABIC Innovative Plastics, consultant) wrote:
Sloan [mailto:user-b1d2c84d244b@xymon.invalid] wrote:
We've not had a bb server go down in all the years we've been using
it, but sometimes wan connectivity goes away due to circumstances
beyond our control.
This is by far the biggest annoyance we have with all system
monitoring: when networks go down. It's a problem with every
monitoring tool
there is and I can't think of any way to solve it: the monitoring
system has no way of knowing whether a system is down because it
crashed or if it's down because the network went down. All it knows
is that it can't talk to the system anymore and something is wrong, so
it generates an alert. When a whole network goes down, it can become
hundreds of simultaneous alerts. And that's annoying enough when it's
just email alerts. When you use Hobbit to generate cases in your
trouble ticket system, that can be hundreds of new, useless cases to
manually close.
We don't want to raise the amount of time a system has to be down
before Hobbit generates an alert, because we want to know as soon as
possible. But if we keep that number too low, then when the network
has a brief hiccup, we get hundreds of redundant cases. This is
especially a problem with overseas networks on the WAN.
I think the only possible solution would be for Hobbit to have some
kind of flood-detection routine built in, where it could tell how
rapidly it was sending alerts about connection problems for machines
all on the same network, and was smart enough to think "Whoa, I'm
about to send 100 connection alarms about systems on the same
network... Instead of sending 100 of them, maybe I'll just send ONE
alert saying 'You got a big problem here.'"
Doug Linder
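
A rough sketch of the flood-detection idea above, not anything Hobbit is known to provide. It assumes the connection failures from one polling cycle are available as (hostname, IP) pairs and that "same network" can be approximated by a shared subnet prefix. The function name `collapse_alerts`, the /24 prefix, and the threshold of 5 are all hypothetical choices for illustration.

```python
import ipaddress
from collections import defaultdict


def collapse_alerts(failures, prefix_len=24, threshold=5):
    """Collapse per-host connection alerts into one alert per network.

    failures: iterable of (hostname, ip_string) pairs that failed their
    connection test in the same polling cycle (an assumed input format).
    Returns a list of alert message strings.
    """
    by_net = defaultdict(list)
    for host, ip in failures:
        # strict=False lets a host address stand in for its network.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        by_net[net].append(host)

    messages = []
    for net, hosts in by_net.items():
        if len(hosts) >= threshold:
            # Many failures on one network: send a single summary.
            messages.append(
                f"{len(hosts)} hosts unreachable on {net}: possible network outage"
            )
        else:
            # Few failures: keep the individual alerts.
            messages.extend(f"{h} unreachable" for h in hosts)
    return messages
```

Called with six failures on 10.1.2.x and one unrelated host, this would yield one summary line for the 10.1.2.0/24 network plus a single alert for the lone host, instead of seven separate cases.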
--
Rich Smrcina
VM Assist, Inc.
user-61add9955ef9@xymon.invalid
http://www.linkedin.com/in/richsmrcina