Xymon Mailing List Archive

proc : multiple alerts?

Ralph Mitchell
Mon, 7 Oct 2013 07:25:12 -0400
Message-Id: <user-7461b9859e8a@xymon.invalid>

Speaking of reusable code - it might be possible to re-purpose Jeremy
Laidman's fs-test:

https://wiki.xymonton.org/doku.php/monitors:fs-test

It looks at the "disk" column and creates new columns for any filesystems
that show non-green.  Once the "disk" column goes green again, it deletes
the extra columns. You could adapt that to look in the procs list and
create new columns that group system procs in one place, dba procs in
another, etc.
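
A rough sketch of the grouping step, in Python. This is not how fs-test itself is written; the column names, the patterns, and the split_procs_by_group helper are all made up for illustration, and fetching the procs data from Xymon and sending the results back is left out.

```python
import re

# Hypothetical groups: new column name -> pattern for the processes it covers.
GROUPS = {
    "procs-system": re.compile(r"\b(sshd|crond|rsyslogd)\b"),
    "procs-dba": re.compile(r"\b(mysqld|postgres|oracle)\b"),
}


def split_procs_by_group(expected, running, groups=GROUPS):
    """Return {column: (color, missing)} for each group.

    expected: process names that should be running
    running:  command lines currently seen on the host
    """
    result = {}
    for column, pattern in groups.items():
        # Expected processes that belong to this group.
        wanted = [name for name in expected if pattern.search(name)]
        # A process counts as running if its name appears in any command line.
        missing = [name for name in wanted
                   if not any(name in cmd for cmd in running)]
        result[column] = ("red" if missing else "green", missing)
    return result
```

Each non-green entry would become its own column, so a second process failing in a different group changes a different column and can alert separately.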

Ralph Mitchell


On Mon, Oct 7, 2013 at 7:12 AM, Betsy Schwartz <user-c61747246f66@xymon.invalid> wrote:
BUT – if the procs column is already in the red state when the 2nd
process goes bad, it won't trigger a notification…

That depends on your rules. If you have procs set to alert every five
minutes while it is red, you will get repeated notifications. However, if
someone *acks* it or signs it out, you won't.

There may be another way to slice this:
   - if one of these services is listening on a port, you can do a custom
     ports test in protocols.cfg and alert on that
   - if one of these services is writing a log file or moving files around,
     use a files or msgs test
   - if there's a web service, use the CONT= feature to create a named http
     test and create an alert on that

Depending on the size of your shop, you may be able to sidestep this another
way. In our case, it turns out that giving the NOC privileges to run sudo
/sbin/service, along with some debugging documentation, cut out a lot of
pages all around :-)   I did end up doing a custom test in one case because
testing on /sbin/service status was deemed to be preferable to just looking
for the process. If you have to do a custom test for a proc, at least it's
a very short test.

Once you've written "do a system call and alert on results" you should
have a fairly generic and reusable piece of code.
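
A minimal Python sketch of such a "run a command and alert on the result" test. It assumes the script runs as a Xymon client extension, so the XYMON and XYMSRV environment variables are set. The function names, the example column name and service name, and the rule that exit code 0 means green are illustrative choices, not anything from the thread.

```python
import os
import subprocess
from datetime import datetime


def run_check(command):
    """Run a command; return (color, output). Exit code 0 is green, else red."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "red", f"could not run {command!r}: {exc}"
    color = "green" if proc.returncode == 0 else "red"
    return color, (proc.stdout + proc.stderr).strip()


def status_message(host, column, color, text, now=None):
    """Build a Xymon status message; dots in the hostname become commas."""
    now = now or datetime.now()
    machine = host.replace(".", ",")
    return f"status {machine}.{column} {color} {now:%a %b %d %H:%M:%S %Y}\n\n{text}"


def send_to_xymon(message):
    """Hand the message to the xymon client binary (assumes XYMON and XYMSRV are set)."""
    subprocess.run([os.environ["XYMON"], os.environ["XYMSRV"], message], check=True)


# Example use (hypothetical service and column names):
# color, output = run_check(["/sbin/service", "myapp", "status"])
# send_to_xymon(status_message("www.example.com", "myapp", color, output))
```

Only the command and the column name change from one check to the next, which is what makes it reusable.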