Highlights of the 4.3.0 version

From: Henrik Størner
Date: Fri, 3 Aug 2007 21:53:14 +0200
Message-Id: <user-d5a968ae7a3d@xymon.invalid>

On Fri, Aug 03, 2007 at 01:15:27PM -0400, Scott Walters wrote:
> I am definitely in the "monitor only" camp.
Me too. For those who feel differently, Hobbit does provide the
necessary hooks so you can trigger actions when some status goes red:
either through alert scripts, or through the bb "query" command which
others have mentioned. In fact, I implemented the "query" feature
because I needed it to set up such an automated recovery for one of
our customers at work.
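
As a rough illustration of the "query" approach, here is a minimal Python
sketch. It assumes the bb client is on the PATH, that the server listens
on 127.0.0.1, and that the reply to a query is the first line of the
status report with the color as its first word. The function names are
made up for this example and are not part of Hobbit.

```python
import subprocess

# Colors a status is assumed to be able to report.
KNOWN_COLORS = {"green", "yellow", "red", "blue", "purple", "clear"}


def parse_color(reply):
    """Return the color at the start of a query reply, or None if absent."""
    lines = reply.strip().splitlines()
    if not lines:
        return None
    color = lines[0].split()[0].lower()
    return color if color in KNOWN_COLORS else None


def needs_recovery(reply):
    """Decide whether a recovery action should run: only on red."""
    return parse_color(reply) == "red"


def query_status(host, test, server="127.0.0.1"):
    """Ask the server for the latest status of host.test.

    Assumption: invoking `bb SERVER "query HOST.TEST"` prints the reply
    on standard output.
    """
    result = subprocess.run(
        ["bb", server, f"query {host}.{test}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

A cron job could call needs_recovery(query_status("www.example.com", "http"))
and start whatever recovery procedure is appropriate when it returns True.
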
> All of those "operational" aspects aside, I've convinced myself from a
> security point of view, corrective action from monitoring is bad: a
> clear violation of the separation of duties.  You don't want your
> auditors "cleaning up" the numbers as they go over your books.
Good point.
> The question I have yet to answer satisfactorily is, "Should
> the monitoring system perform additional data collection after
> specific errors?"  For example, running a particular "find" command
> when disk usage increases to try and identify which files are causing
> the partition to fill.
It can be very useful at times, especially when you have to do a
"root cause analysis" to explain why some service was down at 2 AM
and the problem was fixed by a 2nd-level technician who just rebooted
the box. That's why I added the feature that makes Hobbit save the
latest client-data report when a status goes yellow or red. It has
helped me track down the cause of quite a few service outages.
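
For the disk-usage example raised above, the extra data collection could
look something like the following Python sketch. It is not how Hobbit
collects client data; it only shows the idea of listing the largest
files under a directory once a disk status changes color. The function
name and the default count are assumptions made for this example.

```python
import heapq
import os


def largest_files(root, count=10):
    """Return up to `count` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat avoids following symlinks out of the tree.
                sizes.append((os.lstat(path).st_size, path))
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return heapq.nlargest(count, sizes)
```

Saving that list together with a timestamp when the status turns yellow
or red gives the person doing the analysis something to look at even
after the box has been rebooted.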


Regards,
Henrik