Just wondering if I'm the only one who's done this
list Bruce Ferrell
I've never seen it written up, and I've been a user since the days when xymon was BigBrother (yes, I know they don't share code, but xymon IS the spiritual descendant).

Recently, I started monitoring a remote service that was failing regularly. It would send me an alert and I'd go fix the service. I got tired of doing the restarts by hand, so I looked a bit more into alerts.cfg. Yes, I can send an alert via a script (I do that all the time for SMS)... Wait... Can that script do anything else?

Well, I'll be! I wrote one that sshes into the offending system (key-based authentication), performs simple diagnostics, collects relevant logs, then restarts the downed service.

I have seen a number of times, "xymon/bigbrother doesn't restart things".

Thoughts? Concerns? Bueller? Bueller??
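For readers who want to try the same idea, here is a minimal sketch of such an alert script, not the script described above (which was not posted). It assumes Xymon passes alert details to a SCRIPT recipient through environment variables such as BBHOSTNAME, BBSVCNAME, BBCOLORLEVEL and RECOVERED, as I recall from the alerts.cfg man page; check your version before relying on them. The host name, file path, recipient name, the test-to-unit mapping, the use of systemd, and passwordless sudo on the remote side are all placeholders invented for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical Xymon alert script: diagnose over ssh, then restart a service.

Assumed alerts.cfg rule (host, path and recipient are placeholders):

    HOST=app01.example.com SERVICE=http
        SCRIPT /usr/local/bin/xymon_restart.py restart-bot
"""
import os
import subprocess
import sys

# Placeholder mapping from Xymon test name to the remote systemd unit.
RESTARTABLE = {"http": "httpd"}


def build_commands(host, unit):
    """Return the ssh command lines: diagnostics first, restart last."""
    # BatchMode makes ssh fail instead of prompting, so key-based
    # authentication must already be set up for the Xymon user.
    ssh = ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", host]
    return [
        ssh + ["systemctl", "status", unit, "--no-pager"],
        ssh + ["journalctl", "-u", unit, "-n", "200", "--no-pager"],
        # Assumes the remote account may restart the unit via sudo.
        ssh + ["sudo", "systemctl", "restart", unit],
    ]


def plan(env):
    """Decide what to run for this alert; an empty list means do nothing."""
    if env.get("RECOVERED", "0") != "0":
        return []  # recovery (or disable) notice, nothing to restart
    if env.get("BBCOLORLEVEL") != "red":
        return []  # only act on hard failures
    host = env.get("BBHOSTNAME", "")
    unit = RESTARTABLE.get(env.get("BBSVCNAME", ""))
    if not host or not unit:
        return []  # unknown host or a test we do not manage
    return build_commands(host, unit)


def main():
    """Run the planned commands and return the exit status of the last one."""
    rc = 0
    for cmd in plan(os.environ):
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
        # Keep the collected diagnostics; stderr is used here for simplicity.
        sys.stderr.write(f"$ {' '.join(cmd)}\n{result.stdout}{result.stderr}\n")
        rc = result.returncode
    return rc


# Only act when the alert environment is present (i.e. invoked by Xymon).
if __name__ == "__main__" and "BBHOSTNAME" in os.environ:
    main()
```

Keeping the decision in `plan()` separate from the ssh calls means the rules (red only, no action on recovery, known services only) can be checked without touching a real host. A failing diagnostic step does not stop the restart, since `systemctl status` exits non-zero for a unit that is down.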
list Jeremy Ruffer
Good on you. I can see no reason not to if it works for you. Regards Jeremy
On 1 Sep 2016 06:34, "Bruce Ferrell" <user-24fbf1912cfe@xymon.invalid> wrote:
> [...] I have seen a number of times, "xymon/bigbrother doesn't restart things". Thoughts? Concerns? Bueller? Bueller??
list Henrik Størner
On 01-09-2016 07:33, Bruce Ferrell wrote:
> [...] I have seen a number of times, "xymon/bigbrother doesn't restart things". Thoughts? Concerns?
Sure, you can do that if it suits your way of working. And I can certainly see why it would be nice to avoid restarting the same service again and again. The reason that Xymon doesn't do that "out of the box" is this: Xymon has always been a "watch, but don't act" tool. And that is an inheritance from the Big Brother days. Regards, Henrik
list Richard Hamilton
It's a useful enough possibility...and we all use workarounds from time to time. But as time permits, the real cause of the problem should be found and fixed; and workarounds should not become permanent. I'd worry that a monitoring tool being used to implement workarounds could make that too tempting.
On Thu, Sep 1, 2016 at 8:01 AM, Henrik Størner <user-ce4a2c883f75@xymon.invalid> wrote:
> [...] The reason that Xymon doesn't do that "out of the box" is this: Xymon has always been a "watch, but don't act" tool.
list Ryan Novosielski
This is my feeling on this. It's your business, but no service should need automatic restarting enough to make that an attractive option.

--
Ryan Novosielski - user-46c89e614701@xymon.invalid
Sr. Technologist - 973/972.0922 (2x0922)
Office of Advanced Research Computing - MSB C630, Newark
Rutgers, the State University of NJ, RBHS Campus
On Sep 1, 2016, at 08:42, Richard Hamilton <user-af55987f6d56@xymon.invalid> wrote:
> [...] But as time permits, the real cause of the problem should be found and fixed; and workarounds should not become permanent.