bbfix functionality in hobbit
list Gary B.
Is there any method with hobbit to have the client automatically restart services, like bbfix does for BB? We would like to be able to restart services that tend to fail often (such as SSH tunnels) automatically through Hobbit. Without writing a custom external script, I can't seem to find any information about doing this. Thanks
list Jim Smith
My personal opinion is that having Hobbit (or Big Brother) take action other than sending an alert goes beyond the scope of what it is intended for...as a monitoring system. I had thought about the same thing at one point, but was talked out of it.
▸
-----Original Message-----
From: Gary B. [mailto:user-33b796116d5f@xymon.invalid]
Sent: Friday, July 21, 2006 9:22 AM
To: user-ae9b8668bcde@xymon.invalid
Subject: [hobbit] bbfix functionality in hobbit
Is there any method with hobbit to have the client automatically
restart services, like bbfix does for BB? We would like to be able to
restart services that tend to fail often (such as SSH tunnels)
automatically through Hobbit. Without writing a custom external
script, I can't seem to find any information about doing this.
Thanks
NOTICE: This email contains confidential or proprietary information which may be legally privileged. It is intended only for the named recipient(s). If an addressing error has misdirected the email, please notify the author by replying to this message. If you are not the named recipient, you are not authorized to use, disclose, distribute, copy, print or rely on this email, and should immediately delete it from your computer system.
list Larry Barber
You could just put the program into clientlaunch.cfg. Hobbit restarts those programs automatically when they die. Whether or not this is a good idea I'll leave for you to decide. Thanks, Larry Barber
list Gary B.
▸
Jim Smith wrote:
My personal opinion is that having Hobbit (or Big Brother) take action other than sending an alert goes beyond the scope of what it is intended for...as a monitoring system. I had thought about the same thing at one point, but was talked out of it.
I would tend to agree myself, except 1) some people are asking about such functionality, and 2) it would be nice during oncall weeks to not have to get up and restart processes such as procallator, etc. I would be interested in hearing what talked you out of this, however. If nothing else, I can use it to talk other people out of it ;-)
list Nicolas Dorfsman
What about using the SCRIPT facility of the alert module?

If e-mail is not enough

The MAIL keyword means that the alert is sent in an e-mail. Sometimes this ends up being an SMS to your cell-phone - there are several "e-mail to SMS" gateways that perform this service - but that may not be what you want to do. And also, for an e-mail to actually be delivered requires that the mail-server is working. So if you need full control over how alerts are handled, you can use the SCRIPT method instead. Here's how:

HOST=%(www|intranet|support|mail).foo.com SERVICE=http
    SCRIPT /usr/local/bin/smsalert 4538761925 FORMAT=sms
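A SCRIPT recipient is one place a restart hook could live. Below is a minimal sketch, not a tested recipe. It assumes the alert module passes the alert details to the script in environment variables named BBHOSTNAME, BBSVCNAME, BBCOLORLEVEL and RECOVERED (check the hobbit-alerts.cfg man page for your version). The script name, the "tunnels" test name, the `restart` argument to the init script and the DO_RESTART switch are all hypothetical. Alert scripts run on the Hobbit server, so the sketch assumes the affected host can be reached with ssh.

```shell
#!/bin/sh
# Hypothetical alert handler, e.g. /usr/local/bin/restart-on-alert.
# Assumes the alert module exports BBHOSTNAME, BBSVCNAME, BBCOLORLEVEL
# and RECOVERED to the script; verify this against your Hobbit version.

# Print the command that should run for this alert, or nothing at all.
decide_action() {
    host="$1"; svc="$2"; color="$3"; recovered="$4"

    # Never act on recovery messages.
    [ "$recovered" = "1" ] && return 0

    # Only the (hypothetical) "tunnels" test going red triggers a restart.
    if [ "$svc" = "tunnels" ] && [ "$color" = "red" ]; then
        echo "ssh $host sudo /etc/init.d/sshtunnels restart"
    fi
    return 0
}

ACTION=`decide_action "$BBHOSTNAME" "$BBSVCNAME" "$BBCOLORLEVEL" "$RECOVERED"`
if [ -n "$ACTION" ]; then
    echo "`date`: $ACTION"
    # Dry run by default; set DO_RESTART=1 to actually execute the command.
    if [ "$DO_RESTART" = "1" ]; then
        $ACTION
    fi
fi
```

It would be listed in the alert configuration the same way as the smsalert example, with the script path after the SCRIPT keyword. Keeping the decision in a function that only prints the command makes it easy to dry-run before letting it touch a live service.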
list Jim Smith
Basically, it was just the scope argument. And I know all about on-call weeks...my one backup-analyst and I rotate weeks so I'm on-call half the year! But my first focus would be to attack the issue on its home server. For instance, if it is a Unix server I would investigate creating some kind of script that could be put in /etc/inittab to restart a dead process. If all else fails, however, you gotta do what you gotta do to get some sleep at night!
list Tom Georgoulias
▸
Gary B. wrote:
My personal opinion is that having Hobbit (or Big Brother) take action other than sending an alert goes beyond the scope of what it is intended for...as a monitoring system. I had thought about the same thing at one point, but was talked out of it.
I would tend to agree myself, except 1) some people are asking about such functionality, and 2) it would be nice during oncall weeks to not have to get up and restart processes such as procallator, etc. I would be interested in hearing what talked you out of this, however. If nothing else, I can use it to talk other people out of it ;-)
I've thought about this kind of functionality a lot and my co-workers and I have discussed it several times before, and there are definitely pros and cons. However, it would be cool to have a module (kinda like the NCV one that gives end users a lot of flexibility) that allowed Hobbit to execute some script/command when an alert is triggered. Whether that script/command then collects more debugging info about the alert, tries to repair the system, or opens a ticket with a ticketing system automatically would then be up to the end user. That might keep Henrik out of a lot of scope creep. :)
Tom
list Henrik Størner
▸
On Fri, Jul 21, 2006 at 10:21:40AM -0400, Gary B. wrote:
Is there any method with hobbit to have the client automatically restart services, like bbfix does for BB? We would like to be able to restart services that tend to fail often (such as SSH tunnels) automatically through Hobbit. Without writing a custom external script, I can't seem to find any information about doing this.
You will need to do some scripting, no doubt about that.
Whether it's a good idea or not ... it depends. From the Hobbit
"design" perspective (whoa - that sounds expensive) I have a very
firm belief that Hobbit should *monitor* things, not *fix* them.
I have seen far too many "intelligent" systems get in the way of
real problem-fixing because "intelligent" systems are usually
pretty dumb, and cannot handle anything out of the ordinary. When
they try, they often fail in spectacular ways.
And having things happen behind your back - because you forgot about
that little automatic script someone else set up 2 years ago - is just
plain frustrating.
With that little sermon as introduction, here's what you can do.
On the host(s) where you want to restart these services, write a
script to query the Hobbit server for the status of the service. If it's
red, do the restart. You can use the Hobbit "query" command to tell
what status the service has. E.g. if you want to reset the SSH tunnels
when the "tunnels" status goes red, then this little script run from
the Hobbit client's clientlaunch.cfg would do it:
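#!/bin/sh
# Ask the Hobbit server for the current color of this host's "tunnels" status.
TUNNELSTATUS=`$BB $BBDISP "query $MACHINE.tunnels" | awk '{print $1}'`

# Restart the tunnels only when the status is red.
if test "$TUNNELSTATUS" = "red"; then
    sudo /etc/init.d/sshtunnels stop
    sleep 5
    sudo /etc/init.d/sshtunnels start
    echo "`date`: SSH tunnels restarted"
fi

exit 0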
Regards,
Henrik
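For the client to run Henrik's script, it needs its own section in clientlaunch.cfg. The fragment below is a sketch modeled on the stock [client] section, not something taken from this thread. The section name, the script location under ext/ and the log file name are hypothetical, and the 5-minute interval is an assumption chosen to match a typical polling period.

```
# Hypothetical section; adjust names and paths to your installation.
[tunnelfix]
	ENVFILE $HOBBITCLIENTHOME/etc/hobbitclient.cfg
	CMD $HOBBITCLIENTHOME/ext/tunnelfix.sh
	LOGFILE $HOBBITCLIENTHOME/logs/tunnelfix.log
	INTERVAL 5m
```

The ENVFILE line is there so that the client settings the script relies on, such as $BB and $BBDISP, should be defined when it runs. The "SSH tunnels restarted" line that the script echoes would end up in the LOGFILE, which gives you a record of what happened behind your back.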
list Gary B.
Thanks all for your responses. They have provided me with a lot more information than I could have hoped for ;-)
list Steve Aiello
I agree completely with Henrik. Adding 'intelligence' to a script is rather difficult. But I do have a rather 'painful' IIS ASP application, and IIS will frequently die, hang, or stop processing ASP. So I wrote a script that runs on each IIS server and queries the HTTP status from the monitoring server; if the status is red, all dllhost and inetinfo processes are killed and IIS is started again.

My fear was that this restart script could get stuck in a loop, i.e. if IIS cannot be started at all. So the script logs the date and time of every IIS restart, and it will only restart IIS if it has not done so more than 3 times in the last hour.

Lately I have been thinking of adding more logic to check whether a restart occurred within the last 5 minutes (the polling period), because I have seen the restart script do its job and fix IIS while the monitoring server has not yet re-checked. The status is still red, so the restart script bounces IIS again...
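The two guards described above (a cap on restarts per hour, and no restart within one polling period of the last one) can be sketched in sh. This is not Steve's script, which runs on Windows; it only illustrates the idea. The function names, the log file location and the reading of "3 times" as "at most 3 restarts per hour" are assumptions.

```shell
#!/bin/sh
# Sketch of a restart guard: at most MAX_PER_HOUR restarts per hour, and
# none within COOLDOWN seconds of a previous one. All names are hypothetical.
RESTART_LOG="${RESTART_LOG:-/var/tmp/restart-guard.log}"
MAX_PER_HOUR=3
COOLDOWN=300    # one 5-minute polling period

# may_restart NOW: succeeds if a restart is allowed at epoch time NOW.
may_restart() {
    now="$1"
    # No log yet means nothing has been restarted so far.
    [ -f "$RESTART_LOG" ] || return 0
    # The log holds one epoch timestamp per restart.
    verdict=`awk -v now="$now" -v max="$MAX_PER_HOUR" -v cool="$COOLDOWN" '
        BEGIN { recent = 0; hot = 0 }
        now - $1 < 3600 { recent++ }
        now - $1 < cool { hot = 1 }
        END { if (hot || recent >= max + 0) print "no"; else print "yes" }
    ' "$RESTART_LOG"`
    [ "$verdict" = "yes" ]
}

# record_restart NOW: remember that a restart happened at epoch time NOW.
record_restart() {
    echo "$1" >> "$RESTART_LOG"
}

# Usage (date +%s is assumed to be available; it is not strictly POSIX):
#   NOW=`date +%s`
#   if may_restart "$NOW"; then
#       record_restart "$NOW"
#       ... restart command goes here ...
#   fi
```

Passing the time in as an argument, rather than calling date inside the function, keeps the guard easy to test with made-up timestamps.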
list Tom Kauffman
Yup. There's always the case that the process died for a good reason and it's going to die again as soon as it restarts. If you don't set up the restart/recovery routine with a bit of logic to quit after a while you can end up with a box that's spending all its cycles just restarting, dumping, and dying. I learned that one the hard way :-)
Tom Kauffman
NIBCO, Inc
CONFIDENTIALITY NOTICE: This email and any attachments are for the exclusive and confidential use of the intended recipient. If you are not the intended recipient, please do not read, distribute or take action in reliance upon this message. If you have received this in error, please notify us immediately by return email and promptly delete this message and its attachments from your computer system. We do not waive attorney-client or work product privilege by the transmission of this message.