Hobbit main view stopped updating
list Brian Thompson
Hi all, We had a meltdown today with our server that hobbit is monitoring, and once we finally bounced the processes that had shown up as purple (after ensuring they wouldn't bomb out again) hobbit completely froze up. It still refreshes, but its time is stuck at the exact time that we bounced the erroneous process. All of the status pages give a "Status not available" page. I've tried rebooting the hobbit server, although I have yet to try rebooting the entire server that's still showing up as offending. Is there some way to reset this? Thanks, Brian Thompson
list Jim Smith
The client on the offending server is running, I trust.
list Brian Thompson
As far as I'm aware, yes. I'm still completely green with hobbit; so far my involvement with it has been very much on the surface, and any issues I've had with hobbit itself have been resolved by a reboot, which I assume would restart the client. I'm sure I'm falling into the "you know what they say happens when you assume" clause right about now...
Anyway, I found this in /var/log/hobbit/history.log:
2007-04-27 11:50:02 Worker process died with exit code 15, terminating
That's one minute past the time that's frozen on my hobbit web view. I also found that I had put an erroneous alert message format in the hobbit-alerts.cfg file. It filled up the page.log file right quick with errors about that, right up until the moment it stopped reporting. I fixed that and tried another restart, to no avail. I'm totally lost with this stuff. The joys of hand-me-downs :-/ Thanks, Brian Thompson
list Daniel J McDonald
The hobbit server process won't restart if it finds a lock file. Try "service hobbit force-start"
-- Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
list Rich Smrcina
Check your ~/server/tmp directory. If there is a socket file called hobbitd_if there, delete it, then try restarting your server. If Hobbit crashes, this gets me every time. Not that Hobbit crashes that often, but I test it with new client data stream data and back end code for that data. Sometimes it gets a little upset... :)
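The cleanup step described above could be scripted roughly as follows. This is only a sketch: the function name `remove_stale_socket` is hypothetical, the directory and file name are taken from the message above and may differ on other installations, and it assumes the Hobbit server has already been stopped, since removing the socket of a running server would be a mistake.

```python
import os
import stat


def remove_stale_socket(tmp_dir, name="hobbitd_if"):
    """Delete a leftover socket file; return True if one was removed.

    tmp_dir and name are assumptions based on the path mentioned in the
    thread (~/server/tmp/hobbitd_if); adjust them for your installation.
    """
    path = os.path.join(os.path.expanduser(tmp_dir), name)
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return False  # nothing was left behind
    if not stat.S_ISSOCK(mode):
        # Refuse to delete anything that is not actually a socket.
        raise ValueError(f"{path} exists but is not a socket")
    os.unlink(path)
    return True
```

With the server stopped, calling `remove_stale_socket("~/server/tmp")` and then starting the server again mirrors the manual steps. The socket-type check is there so that a mistyped path cannot delete an ordinary file.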
--
Rich Smrcina VM Assist, Inc. Phone: XXX-XXX-XXXX Ans Service: XXX-XXX-XXXX user-61add9955ef9@xymon.invalid Catch the WAVV! http://www.wavv.org WAVV 2007 - Green Bay, WI - May 18-22, 2007
list Brian Thompson
I deleted everything in the temp folder to see if something in there could be the issue, and unbeknownst to me I deleted the checkpoint file. I found that out when I was finally able to find a way to start the service. Everything showed up all messed up and I figured it would just take time for it to sort through all of its issues, but I've still got half a dozen services with no status yet, and a boatload of them are still purple. hobbitd has been yellow since I finally got it back up, and I still get the same "Ignored PURPLE status" from all of the custom scripts my overseas counterparts wrote, but now they're accompanied by a wealth of "Bogus status message contains no data: Sent from hobbitd" messages. Is this something that'll eventually work itself out? It's been up for about 2 hours now and it's still getting plenty of these messages. Thanks, Brian Thompson
list Henrik Størner
On Mon, Apr 30, 2007 at 12:19:07PM -0400, Thompson, Brian wrote:
hobbitd has been yellow since I finally got it back up, and I still get the same "Ignored PURPLE status" from all of the custom scripts my overseas counterparts wrote, but now they're accompanied by a large wealth of "Bogus status message contains no data: Sent from hobbitd" messages.
This doesn't sound right. It *might* be related to those purple tests; I think I can see why they would cause it, but I haven't tested it.
Thompson, Brian wrote: Is this something that'll eventually work itself out? It's been up for about 2 hours now and it's still getting plenty of these messages.
I don't think it will stop by itself. It would be interesting to know if it stops when your clients stop sending the purple status messages. Regards, Henrik
list Brian Thompson
Well, I'm now down to one Question Mark status (unchanged in 13634 days), and the rest of the custom scripts are now floating between purple and green. I do, however, now have very few "Bogus status message" errors from hobbitd. I did not do anything to the scripts; they apparently just worked themselves out. I'll speak with the script owners (and DB owners) to resolve the constant fluctuation of purple statuses, and I'm sure when that last question mark goes away I'll be free of any more bogus status messages. Thanks for the help, Brian Thompson
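The "unchanged in 13634 days" figure above can be sanity-checked with a little date arithmetic. The sketch below assumes the page was viewed on 1 May 2007; that date is a guess, since this message is undated in the thread and the previous reply is dated April 30, 2007. The helper name `last_change_date` is hypothetical.

```python
from datetime import date, timedelta


def last_change_date(viewed_on, days_unchanged):
    """Return the date on which the status would last have changed."""
    return viewed_on - timedelta(days=days_unchanged)


# Assumed viewing date; the thread does not state it.
print(last_change_date(date(2007, 5, 1), 13634))  # 1970-01-01
```

Under that assumption the result is the Unix epoch, which would be consistent with a last-change timestamp of zero, that is, a status with no recorded history. That may be a side effect of the deleted checkpoint file, but the thread does not confirm it.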