Xymon Mailing List Archive

throttling alerts

3 messages in this thread

Rob McBroom · Fri, 18 Jun 2010 09:35:09 -0400
From time to time, something will happen that causes Xymon to lose its mind. (For instance, if the LDAP server can’t be reached and the Xymon server is unable to look up the username/UID of the “xymon” user.) I understand why it’s happening and I don’t think Xymon is doing anything wrong here, but these events generate hundreds of e-mails and dozens of SMS messages, so I’m looking for ways to control it.

Is there a way to configure alerting with a rule like this: If you see 5 systems lose connection in a 10 second period, it’s obviously not true so shut up about it already.

Maybe I should be addressing this at the MTA level. If so, any tips there? (I’m using Postfix.)

-- 
Rob McBroom
<http://www.skurfer.com/>
Larry Barber · Fri, 18 Jun 2010 08:57:39 -0500
Look at the "depends" tag on the bb-hosts man page.
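
For illustration, a bb-hosts fragment using the tag might look like the sketch below. The hostnames and addresses are made up, and it assumes the LDAP server is itself tested from the Xymon server so that an "ldap" status exists to depend on. The idea is that a failing ssh test on web1 or web2 should be reported as clear rather than red while the ldap test on ldap1 is also failing, so it should not trigger an alert.

```
# Hypothetical hosts; only the depends=(test:host/test) syntax is taken from the man page.
10.0.0.10  ldap1.example.com  # ldap
10.0.0.21  web1.example.com   # ssh depends=(ssh:ldap1.example.com/ldap)
10.0.0.22  web2.example.com   # ssh depends=(ssh:ldap1.example.com/ldap)
```

As far as I know, the tag is evaluated when the network tests are run, so it is worth checking the man page for which statuses it can and cannot cover.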

Thanks,
Larry Barber

TJ Yang · Fri, 18 Jun 2010 10:21:44 -0500
Hi, Rob
Currently, to my knowledge, the hobbit alert daemon has no concept of
controlling the volume of outgoing alerts.
This should be on Xymon's development roadmap. But for now, to add
this feature without touching the C source code, an MTA-level solution
that controls e-mail (including pages sent to an e-mail address) is
feasible.

I also thought about using postfix+spamassassin, but I have since
become more interested in writing a Perl script to do the alert
throttling. This script would be the only script in
hobbit-alerts.cfg.
I haven't had a chance to implement this idea yet.
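
A minimal sketch of the core of such a throttling script, written in Python rather than Perl, and using the "5 alerts in 10 seconds" rule from the original question as defaults. The class and method names are hypothetical and nothing here is part of Xymon. It counts every alert it sees in a trailing time window and only lets alerts through while that count stays below the limit.

```python
from collections import deque


class AlertThrottle:
    """Suppress alerts once `limit` or more arrive within `window` seconds.

    Hypothetical helper for illustration; not part of Xymon.
    """

    def __init__(self, limit=5, window=10.0):
        self.limit = limit
        self.window = window
        self._seen = deque()  # arrival times of all alerts, forwarded or not

    def allow(self, now):
        """Record an alert arriving at time `now`; return True to forward it."""
        self._seen.append(now)
        # Forget alerts that have fallen out of the trailing window.
        while self._seen and now - self._seen[0] >= self.window:
            self._seen.popleft()
        # Forward only while the burst is still below the limit.
        return len(self._seen) < self.limit


if __name__ == "__main__":
    throttle = AlertThrottle(limit=5, window=10.0)
    for t in [0, 1, 2, 3, 4, 5, 20]:
        print(t, "send" if throttle.allow(t) else "suppress")
```

Two caveats. The first few alerts of a burst still go out, so suppressing a storm completely would mean holding alerts back for the length of the window before deciding. Also, assuming the alert script is started once per alert, the timestamps would have to be kept somewhere persistent (a file, for example) between runs; that part is left out here.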


tj
-- 
T.J. Yang