Xymon Mailing List Archive

Hobbit main view stopped updating

8 messages in this thread

list Brian Thompson · Fri, 27 Apr 2007 15:18:40 -0400 ·
Hi all,

We had a meltdown today with our server that hobbit is monitoring, and
once we finally bounced the processes that had shown up as purple (after
ensuring they wouldn't bomb out again) hobbit completely froze up.  It
still refreshes, but its time is showing the exact time that we bounced
the erroneous process.  All of the status pages give a "Status not
available" page.

I've tried rebooting the hobbit server, although I have yet to try
rebooting the entire server that's still showing up as offending.  Is
there some way to reset this?

Thanks,

Brian Thompson
list Jim Smith · Fri, 27 Apr 2007 14:21:21 -0500 ·
The client on the offending server is running, I trust.
list Brian Thompson · Fri, 27 Apr 2007 15:56:41 -0400 ·
As far as I'm aware, yes.  I'm still completely green about hobbit; so
far my involvement with it has been very much on the surface, and any
issues I've had with hobbit itself have been resolved by a reboot, which
I assume would restart the client.  I'm sure I'm falling into the "you
know what they say happens when you assume" clause right about now...
 
Anyway, I found this in /var/log/hobbit/history.log:
 
2007-04-27 11:50:02 Worker process died with exit code 15, terminating
 
That's one minute past the time that's frozen on my hobbit web view.
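
For anyone wanting to line up such entries against the frozen time on the web view, here is a minimal sketch, assuming every relevant log line has the same "YYYY-MM-DD HH:MM:SS message" layout as the one quoted above. The function name and the sample list are hypothetical; the sketch only extracts the timestamp and exit code and says nothing about what the exit code means.

```python
import re
from datetime import datetime

# Pattern modeled on the single log line quoted in this message.
# Assumption: other "Worker process died" entries use the same layout.
WORKER_DIED = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"Worker process died with exit code (\d+)"
)


def find_worker_deaths(lines):
    """Return a list of (timestamp, exit_code) for each matching line."""
    deaths = []
    for line in lines:
        match = WORKER_DIED.match(line.strip())
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            deaths.append((when, int(match.group(2))))
    return deaths


# Hypothetical usage with the line from the log above.
sample = [
    "2007-04-27 11:50:02 Worker process died with exit code 15, terminating",
]
for when, code in find_worker_deaths(sample):
    print(when, code)
```

To run it against a real file, pass an open file object instead of the sample list, since iterating over a file yields its lines.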
 
I also found that I had put in an erroneous alert message format in the
hobbit-alerts.cfg file.  It filled up the page.log file right quick with
errors about that, right up until the moment it stopped reporting.  I
fixed that and tried another restart to no avail.
 
I'm totally lost with this stuff.  The joys of hand-me-downs :-/
 
Thanks,
 
Brian Thompson
list Daniel J McDonald · Fri, 27 Apr 2007 15:32:03 -0500 ·
quoted from Brian Thompson
On Fri, 2007-04-27 at 15:56 -0400, Thompson, Brian wrote:
As far as I'm aware, yes.  I'm completely green about all of hobbit
still, so far my involvement with it has been very much so on the
surface, any issues I've had with hobbit itself have been resolved by
a reboot, which I assume would restart the client.  I'm sure I'm
falling into the "you know what they say what happens when you assume"
clause right about now...

The hobbit server process won't restart if it finds a lock file.
Try "service hobbit force-start"
        
-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
Austin Energy
http://www.austinenergy.com
list Rich Smrcina · Fri, 27 Apr 2007 17:39:25 -0500 ·
Check your ~/server/tmp directory.  If there is a socket file called hobbitd_if there, delete it, then try restarting your server.

If Hobbit crashes, this gets me every time.  Not that Hobbit crashes that often, but I test it with new client data stream data and back end code for that data.  Sometimes it gets a little upset... :)
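
A minimal sketch of that cleanup, assuming the server has already been stopped and that the tmp directory is the ~/server/tmp mentioned above. The function name is hypothetical. It removes the file only if it really is a Unix socket and leaves everything else in the directory alone.

```python
import stat
from pathlib import Path


def remove_stale_socket(tmp_dir, name="hobbitd_if"):
    """Delete tmp_dir/name only if it exists and is a Unix socket.

    Returns True if a socket was removed, False otherwise.
    No other file in tmp_dir is touched.
    """
    candidate = Path(tmp_dir) / name
    try:
        mode = candidate.lstat().st_mode
    except FileNotFoundError:
        return False
    if not stat.S_ISSOCK(mode):
        # Same name but not a socket: leave it for a human to inspect.
        return False
    candidate.unlink()
    return True


# Hypothetical usage (path assumed from the message above):
#   remove_stale_socket(Path.home() / "server" / "tmp")
```

Checking the file type first is a deliberate choice: the directory may hold other state the server needs, so the sketch avoids any blanket delete.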
-- 

Rich Smrcina
VM Assist, Inc.
Phone: XXX-XXX-XXXX
Ans Service:  XXX-XXX-XXXX
user-61add9955ef9@xymon.invalid

Catch the WAVV!  http://www.wavv.org
WAVV 2007 - Green Bay, WI - May 18-22, 2007
list Brian Thompson · Mon, 30 Apr 2007 12:19:07 -0400 ·
I deleted everything in the temp folder to see if something in there
could be the issue, and unbeknownst to me I deleted the checkpoint file.

I found that out when I was finally able to find a way to start the
service.  Everything showed up all messed up and I figured it would just
take time for it to sort through all of its issues, but I've still got a
half dozen no-status yet services, and a boatload of them are still
purple.

hobbitd has been yellow since I finally got it back up, and I still get
the same "Ignored PURPLE status" from all of the custom scripts my
overseas counterparts wrote, but now they're accompanied by a large
wealth of "Bogus status message contains no data: Sent from hobbitd"
messages.

Is this something that'll eventually work itself out?  It's been up for
about 2 hours now and it's still getting plenty of these messages.

Thanks,

Brian Thompson
list Henrik Størner · Mon, 30 Apr 2007 21:58:51 +0200 ·
quoted from Brian Thompson
On Mon, Apr 30, 2007 at 12:19:07PM -0400, Thompson, Brian wrote:
hobbitd has been yellow since I finally got it back up, and I still get
the same "Ignored PURPLE status" from all of the custom scripts my
overseas counterparts wrote, but now they're accompanied by a large
wealth of "Bogus status message contains no data: Sent from hobbitd"
messages.

This doesn't sound right. It *might* be related to those purple tests;
I think I can see why they would cause it, but I haven't tested it.
quoted from Brian Thompson

Is this something that'll eventually work itself out?  It's been up for
about 2 hours now and it's still getting plenty of these messages.

I don't think it will stop by itself.  It would be interesting to know if it
stops when your clients stop sending the purple status messages.
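
One way to check that is to count the messages per hour and watch whether the count drops. This is a rough sketch only, assuming each log line starts with a "YYYY-MM-DD HH:MM:SS " timestamp; the function name is hypothetical and the real log layout may differ.

```python
import re
from collections import Counter

# Assumption: log lines begin with "YYYY-MM-DD HH:MM:SS ".
HOUR_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2} ")


def bogus_per_hour(lines, needle="Bogus status message"):
    """Count lines containing `needle`, keyed by 'YYYY-MM-DD HH'."""
    counts = Counter()
    for line in lines:
        match = HOUR_PREFIX.match(line)
        if match and needle in line:
            counts[match.group(1)] += 1
    return counts
```

Lines without a recognizable timestamp are skipped rather than guessed at, so the hourly totals only reflect entries that can be placed in time.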


Regards,
Henrik
list Brian Thompson · Tue, 1 May 2007 10:50:04 -0400 ·
Well, I'm now down to one Question Mark status (unchanged in 13634 days),
and the rest of the custom scripts are now floating between purple and
green.  However, I now have very few "Bogus status message" errors
from hobbitd.  I did not do anything to the scripts; they apparently
just worked themselves out.

I'll speak with the script owners (and DB owners) to resolve the
constant fluctuation of purple statuses, and I'm sure when that last
question mark goes away I'll be free of any more bogus status messages.

Thanks for the help,

Brian Thompson