pagetype: HOST alerting feature?
list Bruce Lysik
Hi,
So I migrated one of our BB installations officially over to Hobbit today. Woohoo. So now the questions and comments from the other sysadmins start to come in.
Bigbrother apparently has a paging setting where if all the hosts for a check fail, it will only page you on one of the checks. I.e., a host's NIC fails, so the ping, http and snmp checks fail, but Bigbrother would only page you about the ping check failing, preventing a deluge to your pager.
Does Hobbit have a way to do this?
-- Bruce Z. Lysik <user-4e63a10f8934@xymon.invalid> Operations Engineer
list Bruce Lysik
▸
Bigbrother apparently has a paging setting where if all the hosts for a check fail, it will only page you on one of the checks.
Er, I meant: a paging setting where if all the checks for a host fail...
-- Bruce Z. Lysik <user-4e63a10f8934@xymon.invalid> Operations Engineer
list Charles Jones
▸
Bruce Lysik wrote:
Bigbrother apparently has a paging setting where if all the hosts for a check fail, it will only page you on one of the checks.
Er, I meant: a paging setting where if all the checks for a host fail...
depends=(testA:host1/test1,host2/test2),(testB:host3/test3),[...]
This tag allows you to define dependencies between tests. If
"testA" for the current host depends on "test1" for host "host1" and
test "test2" for "host2", this can be defined with
depends=(testA:host1/test1,host2/test2)
When deciding the color to report for testA: if testA has failed and
either host1/test1 or host2/test2 has failed as well, then the
color of testA will be "clear" instead of red or yellow.
Since all tests are actually run before the dependencies are
evaluated, you can use any host/test in the dependency - regardless
of the actual sequence that the hosts are listed, or the tests run.
It is also valid to use tests from the same host that the dependency
is for. E.g.
1.2.3.4 foo # http://foo/ webmin depends=(webmin:foo/http)
is valid; if both the http and the webmin tests fail, then webmin
will be reported as clear.
Note: The "depends" tag is evaluated on the BBNET server while
running the network tests. It can therefore only refer to other
network tests that are handled by the same BBNET server - there is
currently no way to use e.g. the status of locally run tests
(disk, cpu, msgs) or network tests from other BBNET servers in a
dependency definition. Such dependencies are silently ignored.
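The evaluation rule described above can be sketched roughly as follows. This is only an illustration in Python, not Hobbit's actual code: the function names `parse_depends` and `apply_depends`, the `results` dictionary layout, and the choice to treat both red and yellow as "failed" are assumptions made for the example.

```python
import re

# Assumption for this sketch: red and yellow both count as "failed".
FAILED = {"red", "yellow"}

def parse_depends(tag):
    """Parse 'depends=(testA:host1/test1,host2/test2),(testB:host3/test3)'
    into {'testA': [('host1', 'test1'), ('host2', 'test2')], 'testB': [...]}."""
    spec = tag.split("=", 1)[1]
    deps = {}
    for test, targets in re.findall(r"\(([^:()]+):([^()]*)\)", spec):
        deps[test] = [tuple(t.split("/", 1)) for t in targets.split(",")]
    return deps

def apply_depends(host, results, deps):
    """Return a copy of `results` in which a failed test of `host` becomes
    "clear" when at least one of the tests it depends on has also failed.
    All lookups use the original results, mirroring the note that every
    test is run before dependencies are evaluated."""
    adjusted = dict(results)
    for test, targets in deps.items():
        if results.get((host, test)) not in FAILED:
            continue
        if any(results.get(target) in FAILED for target in targets):
            adjusted[(host, test)] = "clear"
    return adjusted

# The webmin example from the text: both http and webmin have failed.
results = {("foo", "http"): "red", ("foo", "webmin"): "red"}
deps = parse_depends("depends=(webmin:foo/http)")
adjusted = apply_depends("foo", results, deps)
```

With these inputs, `adjusted` keeps http red and reports webmin as clear, which matches the webmin example in the excerpt.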
list Bruce Lysik
<snip 'depends' snippet>
Interesting. So I guess one way to do it would be to have all checks for a given host depend on that host's conn check. So if I'm comprehending correctly, this should work:
1.2.3.4 foo # http://foo/ smtp depends=(http:foo/conn),(smtp:foo/conn)
In theory, the http and smtp checks would be clear if conn is red, and no alert would be sent for them. (While an alert would be sent for the red conn check.)
That's pretty cool, but unfortunately it doesn't solve the problem I've run into: I have some hosts (hostsA) which all NFS-mount from hostB. hostB dies. Now when the BB clients on hostsA run their disk checks, the df hangs, which causes all the local BB checks on those hosts to go purple in Hobbit. Hobbit proceeds to send out five alerts per host (one for each local purple).
In the past, Bigbrother would only send out one alert per group of purples. Is there any way to roll up alerts like this?
-- Bruce Z. Lysik <user-4e63a10f8934@xymon.invalid> Operations Engineer
list Henrik Størner
▸
On Thu, Feb 17, 2005 at 05:01:53PM -0800, Bruce Lysik wrote:
So I migrated one of our BB installations officially over to Hobbit today. Woohoo. So now the questions and comments from the other sysadmins start to come in.
Oh my god ... they're actually *using* the darn thing :-)
▸
Bigbrother apparently has a paging setting where if all the hosts for a check fail, it will only page you on one of the checks. I.e., a host's NIC fails, so the ping, http and snmp checks fail, but Bigbrother would only page you about the ping check failing, preventing a deluge to your pager. Does Hobbit have a way to do this?
It should do that "out of the box". Hobbit's network tester mimics the way BB does network tests, so if the ping test fails the "conn" column will be red, but the other network tests for that host will go "clear". And the "clear" color normally does not trigger a page.
The "depends" setting mentioned here is for dependencies between hosts, e.g. if your webserver needs an application server to be up before it can send a response.
Regards,
Henrik
list Henrik Størner
▸
On Thu, Feb 17, 2005 at 06:12:41PM -0800, Bruce Lysik wrote:
I have some hosts (hostsA) which all NFS-mount from hostB. hostB dies. Now when the BB clients on hostsA run their disk checks, the df hangs, which causes all the local BB checks on those hosts to go purple in Hobbit. Hobbit proceeds to send out five alerts per host (one for each local purple).
OK, that's pretty annoying.
▸
In the past, bigbrother would only send out one alert per group of purples. Is there any way to roll-up alerts like this?
Did BB really have this? I didn't know that. Hobbit doesn't support it - sorry.
I think it might be worthwhile to look at doing "alert-merging" more generally, e.g. so you'll get all alerts for a given recipient merged into one message based on some criteria. But that's for a future version.
Henrik
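To make the "alert-merging" idea concrete, here is a minimal Python sketch that groups pending alerts by recipient and emits one message per recipient. It is purely hypothetical: Hobbit does not offer this, and the `merge_alerts` name, the tuple layout and the message format are all assumptions chosen for the example.

```python
from collections import defaultdict

def merge_alerts(alerts):
    """alerts is an iterable of (recipient, host, test, color) tuples.
    Returns {recipient: message_text}, i.e. one merged message per
    recipient, with one line per alerting host/test."""
    grouped = defaultdict(list)
    for recipient, host, test, color in alerts:
        grouped[recipient].append(f"{host}.{test} is {color}")
    return {recipient: "\n".join(lines) for recipient, lines in grouped.items()}

# Hypothetical pending alerts after the NFS server dies.
alerts = [
    ("oncall", "hostA1", "disk", "purple"),
    ("oncall", "hostA1", "cpu", "purple"),
    ("storage", "hostB", "conn", "red"),
]
merged = merge_alerts(alerts)
```

In this example the "oncall" recipient gets a single message listing both purple tests instead of two separate pages. Other merge criteria (per host, per time window) would need a different grouping key.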