Xymon Mailing List Archive

XYMON - corrective measures?

7 messages in this thread

list Tom S · Fri, 23 Dec 2011 15:01:47 -0500 ·
It would be nice to have some way to have XYMON take 'corrective measures'
for an alert.
Example: if Apache is hanging but the connection to the client system is
OK, log on with ssh and restart Apache2.

Is there a way to implement this easily?

TIA!
list Steven Carr · Fri, 23 Dec 2011 20:07:45 +0000 ·
Take a look at the alerts.cfg man page...
http://www.xymon.com/xymon/help/manpages/man5/alerts.cfg.5.html

You can configure an alert to trigger a custom script, that script could
then run off and do whatever you need it to do.

Steve
list Tom S · Fri, 23 Dec 2011 16:57:25 -0500 ·
Thanks.

So will this work? (example in alerts.cfg)
HOST=www.foo.com SERVICE=http
MAIL user-d0a7de99275f@xymon.invalid DURATION>2 COLOR=red
SCRIPT /usr/local/bin/restartapache.sh

and in restartapache.sh I would have the following:
#!/bin/bash
ssh xymon@web1-server 'sudo /etc/init.d/apache2 restart'
exit
(assuming xymon is in the sudoers list)

How can I make sure that 'SCRIPT /usr/local/bin/restartapache.sh' is
only run once per alert, so that in case of network issues (connecting to
port 80) it doesn't keep trying to ssh in and restart the server every few
minutes?

TIA
list Henrik Størner · Fri, 23 Dec 2011 23:05:57 +0100 ·
On 23-12-2011 22:57, Tom S wrote:
Thanks.

So this will work?: (example in alerts.cfg)
HOST=www.foo.com SERVICE=http
MAIL user-d0a7de99275f@xymon.invalid DURATION>2 COLOR=red
SCRIPT /usr/local/bin/restartapache.sh
Check the alerts.cfg man page: the SCRIPT command needs a "recipient"
argument. If your script doesn't use it, just put dummy text after the
script path.
quoted from Tom S
and in restartapache.sh I would have the following:
#!/bin/bash
ssh xymon@web1-server 'sudo /etc/init.d/apache2 restart'
exit
(considering xymon is in the sudoers list)

How can I make sure that the 'SCRIPT /usr/local/bin/restartapache.sh' is
only run once per alert in case of network issues (connecting to port
80) it doesn't keep trying to ssh and restarting the server every few
minutes?
Use REPEAT to limit it to e.g. once every 24 hours. Note that the 
repeat-setting gets reset once the alert clears, so it will work OK even 
if the server goes down at 10 AM, comes back up, and then goes down 
again at 7 PM; in both cases the script will be called.
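
REPEAT limits how often the script is invoked; the script itself can also be made defensive. The following is only a minimal sketch. It assumes the alert module exports BBCOLORLEVEL and RECOVERED to SCRIPT recipients, as the alerts.cfg man page describes (worth checking against your version). TARGET and SSH_CMD are hypothetical variables introduced here for illustration; BatchMode and ConnectTimeout are standard OpenSSH options.

```shell
#!/bin/bash
# Sketch of a restartapache.sh for use as a SCRIPT recipient.
# TARGET and SSH_CMD are hypothetical knobs, not part of Xymon.
TARGET="${TARGET:-web1-server}"
SSH_CMD="${SSH_CMD:-ssh}"

restart_apache() {
    # Do nothing for recovery notifications (assumed RECOVERED=1).
    if [ "${RECOVERED:-0}" = "1" ]; then
        return 0
    fi
    # Do nothing unless the alert colour is red.
    if [ "${BBCOLORLEVEL:-red}" != "red" ]; then
        return 0
    fi
    # BatchMode avoids waiting on a password prompt; ConnectTimeout
    # keeps an unreachable host from blocking the caller for long.
    "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=10 \
        "xymon@${TARGET}" 'sudo /etc/init.d/apache2 restart'
}
```

Ending the real script with a call to restart_apache makes it usable as the SCRIPT target.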


Regards,
Henrik
list Tom S · Fri, 23 Dec 2011 17:17:35 -0500 ·
Thanks Henrik,

So this should do it?
HOST=www.foo.com SERVICE=http
MAIL user-d0a7de99275f@xymon.invalid DURATION>2 COLOR=red
SCRIPT /usr/local/bin/restartapache.sh 123456789 REPEAT 1440

The above will email user-d0a7de99275f@xymon.invalid after 2 minutes of red.
It will also call /usr/local/bin/restartapache.sh and run it once every
24 hours if it's down that long?

Do I need to put a DURATION on that one also, or does it keep the 2
minutes from the line above, or does it run as soon as it sees it's
red? Can I put a DURATION on that also? (e.g. SCRIPT
/usr/local/bin/restartapache.sh 123456789 REPEAT 1440 DURATION>2 )

thanks!!
list Henrik Størner · Fri, 23 Dec 2011 23:21:03 +0100 ·
On 23-12-2011 23:17, Tom S wrote:
Thanks Henrik,

So this should do it?
Yes.
HOST=www.foo.com SERVICE=http
MAIL user-d0a7de99275f@xymon.invalid DURATION>2 COLOR=red
SCRIPT /usr/local/bin/restartapache.sh 123456789 REPEAT 1440

That above will email user-d0a7de99275f@xymon.invalid after 2 minutes of RED
It will also call up /usr/local/bin/restartapache.sh and run it once
every 24 hours if it's down that long?
Yes.
quoted from Tom S
Do I need to put in a DURATION on that one also or does it keep the 2
minutes from the above line or does it run it as soon as it sees it's
red? Can I put a DURATION on that also? (e.g. SCRIPT
/usr/local/bin/restartapache.sh 123456789 REPEAT 1440 DURATION>2 )
When you put the DURATION setting on the MAIL or SCRIPT line, it is
local to that recipient, so you can add a DURATION on the SCRIPT line
also, either the same or a different one.

You can also put it on the HOST+SERVICE line, in which case it will be 
the default for all of the recipients.
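
A sketch of how the two placements might be combined, reusing the names from this thread. The 10-minute value is an arbitrary illustration. The alerts.cfg man page writes the repeat keyword as REPEAT=interval, so that form is used here; check it against your installed version.

```
# DURATION on the HOST line is the default for every recipient in the rule.
# DURATION on the SCRIPT line applies to that recipient only.
HOST=www.foo.com SERVICE=http DURATION>2
    MAIL user-d0a7de99275f@xymon.invalid COLOR=red
    SCRIPT /usr/local/bin/restartapache.sh 123456789 DURATION>10 REPEAT=1440
```

With this layout the mail would go out after 2 minutes, while the restart script would wait until the alert has lasted 10 minutes.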


Regards,
Henrik
list Bruce White · Fri, 23 Dec 2011 23:05:06 -0600 ·
I have custom scripts running on a client system that do the actual monitoring and send the status updates to Xymon; if they detect an outage, the script itself performs the corrective action. You can accomplish this with a properly configured sudo, granting the needed rights to the Xymon user running the client.

You can also prevent repeated attempts to restart something by having the script create a restart file, which is deleted on the next monitoring pass that finds a "green" condition.

If a down condition is encountered and the restart file does not exist, perform the restart and create the restart file (I usually also set the status to "yellow" so the restart attempt is captured in the Xymon history). If on the next pass the down condition still exists and the restart file also exists, I set the status to red, which then triggers a page so a human can review the situation. Finally, whenever the script finds a "green" condition, it always checks for the restart file and deletes it if found.
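
The restart-file logic described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual script: decide_status, its arguments, and the flag-file path are hypothetical names, and reporting the resulting colour to Xymon is left out.

```shell
#!/bin/bash
# decide_status CHECK_CMD RESTART_CMD FLAG_FILE
# CHECK_CMD and RESTART_CMD are names of commands or shell functions.
# Prints the colour to report: green, yellow, or red.
decide_status() {
    local check_cmd="$1" restart_cmd="$2" flag="$3"

    if "$check_cmd"; then
        # Healthy: clear any leftover flag so the next outage
        # gets one restart attempt again.
        rm -f "$flag"
        echo green
    elif [ ! -e "$flag" ]; then
        # First failed pass: record the attempt, restart once,
        # and report yellow so the attempt shows in the history.
        touch "$flag"
        "$restart_cmd" >/dev/null 2>&1 || true
        echo yellow
    else
        # Still down after a restart attempt: report red so a
        # human gets paged instead of restarting again.
        echo red
    fi
}
```

A monitoring script would call this once per pass, for example `decide_status check_apache restart_apache /var/tmp/apache.restart`, where both command names and the path are placeholders.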

Hope that helps,
Bruce


 