Xymon Mailing List Archive

bbfix functionality in hobbit

11 messages in this thread

list Gary B. · Fri, 21 Jul 2006 10:21:40 -0400 ·
Is there any method with hobbit to have the client automatically
restart services, like bbfix does for BB?  We would like to be able to
restart services that tend to fail often (such as SSH tunnels)
automatically through Hobbit.  Without writing a custom external
script, I can't seem to find any information about doing this.


Thanks
list Jim Smith · Fri, 21 Jul 2006 09:25:43 -0500 ·
My personal opinion is that having Hobbit (or Big Brother) take action
other than sending an alert goes beyond the scope of what it is intended
for...as a monitoring system.  I had thought about the same thing at one
point, but was talked out of it.

NOTICE: This email contains confidential or proprietary information which may be legally privileged. It is intended only for the named recipient(s). If an addressing error has misdirected the email, please notify the author by replying to this message. If you are not the named recipient, you are not authorized to use, disclose, distribute, copy, print or rely on this email, and should immediately delete it from your computer system.
list Larry Barber · Fri, 21 Jul 2006 09:32:23 -0500 ·
You could just put the program into clientlaunch.cfg. Hobbit restarts those
programs automatically when they die. Whether or not this is a good idea
I'll leave for you to decide.
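
For illustration, a clientlaunch.cfg entry along these lines might look like the sketch below. The section name, the ssh command line, and the log path are assumptions and do not come from this thread. As far as I understand the launcher, a task with no INTERVAL is started once and restarted if it exits, so the command has to stay in the foreground.

```
[sshtunnel]
	CMD ssh -N -L 8080:localhost:80 tunnel@remote.example.com
	LOGFILE $HOBBITCLIENTHOME/logs/sshtunnel.log
```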

Thanks,
Larry Barber
list Gary B. · Fri, 21 Jul 2006 10:32:33 -0400 ·
quoted from Jim Smith
> My personal opinion is that having Hobbit (or Big Brother) take action
> other than sending an alert goes beyond the scope of what it is intended
> for...as a monitoring system.  I had thought about the same thing at one
> point, but was talked out of it.

I would tend to agree myself, except 1) some people are asking about
such functionality, and 2) it would be nice during oncall weeks to not
have to get up and restart processes such as procallator, etc.

I would be interested in hearing what talked you out of this, however.
 If nothing else, I can use it to talk other people out of it ;-)

list Nicolas Dorfsman · Fri, 21 Jul 2006 16:40:13 +0200 ·
What about using the SCRIPT facility of the alert module?

If e-mail is not enough

The MAIL keyword means that the alert is sent in an e-mail. Sometimes this ends up being an SMS to your cell-phone - there are several "e-mail to SMS" gateways that perform this service - but that may not be what you want to do. And also, for an e-mail to actually be delivered requires that the mail-server is working. So if you need full control over how alerts are handled, you can use the SCRIPT method instead. Here's how:

	HOST=%(www|intranet|support|mail).foo.com SERVICE=http
		SCRIPT /usr/local/bin/smsalert 4538761925 FORMAT=sms
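
A script named in a SCRIPT rule can do more than send a message. The sketch below only logs what it was told, as a starting point. It assumes the alert module exports BBHOSTNAME, BBSVCNAME, BBCOLORLEVEL and RECOVERED to the script, as I understand the hobbit-alerts.cfg documentation; check the man page for your version. The function name and log path are hypothetical. Note that alert scripts run on the Hobbit server, not on the monitored host.

```shell
#!/bin/sh
# Hypothetical alert handler for use with the SCRIPT keyword.
# Assumption: the alert module sets BBHOSTNAME, BBSVCNAME, BBCOLORLEVEL
# and RECOVERED in the environment before running this script.

ALERT_LOG="${ALERT_LOG:-/tmp/alert-handler.log}"   # hypothetical log path

handle_alert() {
    # A recovery notice needs no action beyond a log entry.
    if [ "${RECOVERED:-0}" = "1" ]; then
        echo "$BBHOSTNAME $BBSVCNAME recovered" >> "$ALERT_LOG"
        return 0
    fi
    # Otherwise record the host, service and color; a real handler
    # could open a ticket or collect debugging info at this point.
    echo "$BBHOSTNAME $BBSVCNAME is $BBCOLORLEVEL" >> "$ALERT_LOG"
}

# Only act when called with alert data in the environment.
if [ -n "${BBHOSTNAME:-}" ]; then
    handle_alert
fi
```

Because the handler only appends to a log, it is safe to try out by setting the four variables by hand before running it.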
list Jim Smith · Fri, 21 Jul 2006 09:43:40 -0500 ·
Basically, it was just the scope argument.  And I know all about on-call
weeks...my one backup-analyst and I rotate weeks so I'm on-call half the
year!  But my first focus would be to attack the issue on its home
server.  For instance, if it is a Unix server I would investigate
creating some kind of script that could be put in /etc/inittab to
restart a dead process.
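
On a System V init system, this approach usually relies on the `respawn` action, which makes init start the process again whenever it exits. A sketch of such an /etc/inittab line follows; the `tun` id, the runlevels, and the wrapper script path are assumptions, and the wrapper must keep the process in the foreground.

```
tun:2345:respawn:/usr/local/bin/sshtunnel-wrapper.sh
```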

If all else fails, however, you gotta do what you gotta do to get some
sleep at night!
list Tom Georgoulias · Fri, 21 Jul 2006 10:52:00 -0400 ·
quoted from Gary B.
Gary B. wrote:
>> My personal opinion is that having Hobbit (or Big Brother) take action
>> other than sending an alert goes beyond the scope of what it is intended
>> for...as a monitoring system.  I had thought about the same thing at one
>> point, but was talked out of it.
> I would tend to agree myself, except 1) some people are asking about
> such functionality, and 2) it would be nice during oncall weeks to not
> have to get up and restart processes such as procallator, etc.
>
> I would be interested in hearing what talked you out of this, however.
> If nothing else, I can use it to talk other people out of it ;-)

I've thought about this kind of functionality a lot and my co-workers and I have discussed it several times before, and there are definitely pros and cons.  However, it would be cool to have a module (kinda like the NCV one that gives end users a lot of flexibility) that allowed Hobbit to execute some script/command when an alert is triggered. Whether that script/command then collects more debugging info about the alert, tries to repair the system, or opens a ticket with a ticketing system automatically would then be up to the end user.  That might keep Henrik out of a lot of scope creep.  :)

Tom
list Henrik Størner · Fri, 21 Jul 2006 17:07:16 +0200 ·
quoted from Gary B.
On Fri, Jul 21, 2006 at 10:21:40AM -0400, Gary B. wrote:
> Is there any method with hobbit to have the client automatically
> restart services, like bbfix does for BB?  We would like to be able to
> restart services that tend to fail often (such as SSH tunnels)
> automatically through Hobbit.  Without writing a custom external
> script, I can't seem to find any information about doing this.

You will need to do some scripting, no doubt about that.

Whether it's a good idea or not ... it depends. From the Hobbit
"design" perspective (whoa - that sounds expensive) I have a very
firm belief that Hobbit should *monitor* things, not *fix* them.
I have seen far too many "intelligent" systems get in the way of
real problem-fixing because "intelligent" systems are usually
pretty dumb, and cannot handle anything out of the ordinary. When
they try, they often fail in spectacular ways.

And having things happen behind your back - because you forgot about
that little automatic script someone else set up 2 years ago - is just
plain frustrating.

With that little sermon as introduction, here's what you can do.
On the host(s) where you want to restart these services, write a
script to query the Hobbit server for the status of the service. If it's
red, do the restart. You can use the Hobbit "query" command to tell
what status the service has. E.g. if you want to reset the SSH tunnels
when the "tunnels" status goes red, then this little script run from
the Hobbit client's clientlaunch.cfg would do it:

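   #!/bin/sh
   # Restart the SSH tunnels when the Hobbit "tunnels" status is red.

   # Ask the Hobbit server for the status; the first word of the
   # reply is the color.
   TUNNELSTATUS=`$BB $BBDISP "query $MACHINE.tunnels"|awk '{print $1}'`
   if test "$TUNNELSTATUS" = "red"; then
      sudo /etc/init.d/sshtunnels stop
      sleep 5
      sudo /etc/init.d/sshtunnels start
      # Leave a timestamped trace of every automatic restart.
      echo "`date`: SSH tunnels restarted"
   fi

   exit 0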
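
To have the client run a script like this periodically, an entry along the following lines could go into clientlaunch.cfg. This is a sketch: the section name, the script location under ext/, and the log file name are assumptions, and the 5-minute interval is only a guess at a sensible polling period. The ENVFILE line is there so that variables such as $BB, $BBDISP and $MACHINE are set when the script runs.

```
[tunnelcheck]
	ENVFILE $HOBBITCLIENTHOME/etc/hobbitclient.cfg
	CMD $HOBBITCLIENTHOME/ext/tunnelcheck.sh
	LOGFILE $HOBBITCLIENTHOME/logs/tunnelcheck.log
	INTERVAL 5m
```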


Regards,
Henrik
list Gary B. · Fri, 21 Jul 2006 11:29:36 -0400 ·
Thanks all for your responses.  They have provided me with a lot more
information than I could have hoped for ;-)
list Steve Aiello · Fri, 21 Jul 2006 12:11:19 -0400 ·
I agree completely with Henrik. Adding 'intelligence' to a script is
rather difficult. But I do have a rather 'painful' IIS ASP application,
and frequently IIS will die, hang, or stop processing ASP. So I wrote a
script that runs on each IIS server and queries the HTTP status from
the monitoring server; if the status is red, all dllhost and inetinfo
processes are killed and IIS is started. My fear was that this restart
script could get stuck in a loop, i.e. the possibility that IIS cannot
be started. So I added logic to my script that logs the date and time
of each IIS restart. It will only restart IIS if it has not done so
more than 3 times in the last hour. Lately I have been thinking of
adding more logic to check whether a restart occurred in the last 5
minutes (the polling period), because I have seen the restart script do
its job and fix IIS while the monitoring server has not yet checked
again. Thus the restart script bounces IIS again...
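
The guard described here (a restart log, a cap of 3 restarts per hour, and a cooldown of one polling period) could be sketched in shell roughly as follows. This is not the script from this message, which is not shown in the thread. The function names, the log path, and the log format of one epoch timestamp per line are assumptions made for illustration.

```shell
#!/bin/sh
# Restart guard sketch: at most MAX_PER_HOUR restarts per hour, and no
# restart within COOLDOWN seconds of the previous one.
# All names and paths here are hypothetical.

RESTART_LOG="${RESTART_LOG:-/tmp/restart-guard.log}"  # one epoch time per line
MAX_PER_HOUR=3     # stop restarting after this many restarts in an hour
COOLDOWN=300       # seconds; assumed to equal a 5-minute polling period

# may_restart NOW: succeed (status 0) only if a restart is allowed at NOW.
may_restart() {
    now=$1
    # No log yet means nothing has been restarted so far.
    [ -f "$RESTART_LOG" ] || return 0
    awk -v now="$now" -v max="$MAX_PER_HOUR" -v cool="$COOLDOWN" '
        now - $1 < 3600 { recent++ }    # restarts within the last hour
        now - $1 < cool { hot = 1 }     # a restart within the cooldown
        END { exit (recent >= max || hot) ? 1 : 0 }
    ' "$RESTART_LOG"
}

# record_restart NOW: note that a restart was performed at time NOW.
record_restart() {
    echo "$1" >> "$RESTART_LOG"
}
```

A caller would check `may_restart "$(date +%s)"` before restarting and call `record_restart "$(date +%s)"` afterwards, on systems where `date +%s` is available. The cooldown addresses the double bounce described above: the status stays red until the server polls again, so a restart inside that window is skipped.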
list Tom Kauffman · Fri, 21 Jul 2006 13:25:33 -0400 ·
Yup.

There's always the case that the process died for a good reason and it's
going to die again as soon as it restarts. If you don't set up the
restart/recovery routine with a bit of logic to quit after a while you
can end up with a box that's spending all its cycles just restarting,
dumping, and dying.

I learned that one the hard way :-)
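
One way to build in the "quit after a while" logic is to cap the number of restart attempts and then leave the problem for a human. A minimal sketch follows; `try_restart` is a hypothetical helper and not part of Hobbit.

```shell
#!/bin/sh
# try_restart MAX CMD...: run CMD up to MAX times and stop at the first
# success. Returns 1 if every attempt failed.
try_restart() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            echo "restart succeeded on attempt $attempt"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    # Stop here instead of looping forever on a process that keeps dying.
    echo "giving up after $max failed attempts" >&2
    return 1
}
```

For example, `try_restart 3 /etc/init.d/sshtunnels start` would try the init script at most three times; the init script path is taken from the earlier example in this thread.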

Tom Kauffman
NIBCO, Inc

CONFIDENTIALITY NOTICE:  This email and any attachments are for the exclusive and confidential use of the intended recipient.  If you are not
the intended recipient, please do not read, distribute or take action in reliance upon this message. If you have received this in error, please notify us immediately by return email and promptly delete this message and its attachments from your computer system. We do not waive  attorney-client or work product privilege by the transmission of this
message.