Status Unavailable
list Vernon Everett
Hi all

Here's an interesting one.
I have been running Hobbit for a few months now, and everything has been good. I upgraded to 4.0.3, and all was well.
Then suddenly, a few days ago, I started getting intermittent errors when I click on the faces/blobs for more information. Sometimes it works, and I get what I want. Other times, it times out and I get "Status not available" on a white screen.

I tried the following:
1. Restarting Hobbit.
2. Restarting the web server
3. Rebooting the server
4. Upgrading to Hobbit 4.0.4
Still no improvement.

Looking again, I can rule out the web server, because I get the main page, and the column info links on the column descriptions work.
I think I can rule out the network, because all the monitored machines are reachable and pingable from Hobbit.
I think I ruled out a bug by upgrading to 4.0.4.
I am still getting e-mail alerts (some of the time) and graphs are being updated (partially).
When it does the "Status not available" trick for long periods of time, it doesn't update the status or the graphs during that period.

Any ideas, pointers or tips?

Regards
Vernon

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
NOTICE: This message and any attachments are confidential and may contain copyright material of Australian Finance Group Limited or a third party. It is intended solely for the purpose of the addressee and any other named recipient. If you are not the intended recipient, any use, distribution, disclosure or copying of this message is strictly prohibited. The confidentiality attached to this message is not waived or lost by reason of the mistaken transmission or delivery to any unintended party. If you have received this message in error, please notify the author immediately or contact Australian Finance Group on +61 8 9420 7888.
list Vernon Everett
Something else I noticed. Sometimes, the main screen gives me the header and the footer, and no info. No blobs/smileys, no columns, no server names, nothing. Just the menu, the logo, "CURRENT STATUS" and the date at the top, and a line. Then the footer. Line and Hobbit version. It is also on a green background, even though I have at least one test that is reporting yellow. Regards Vernon
list Vernon Everett
Also found this in the hobbitlaunch.log.
Don't understand it much, but it might help you guys.

2005-07-01 09:35:10 Setting up hobbitd channels
2005-07-01 09:35:10 Setting up logfiles
2005-07-01 09:35:10 Task hobbitd started with PID 6563
2005-07-01 09:35:15 Task bbhistory started with PID 6567
2005-07-01 09:35:15 Task bbenadis started with PID 6569
2005-07-01 09:35:15 Task bbpage started with PID 6571
2005-07-01 09:35:15 Task larrdstatus started with PID 6572
2005-07-01 09:35:15 Task larrddata started with PID 6573
2005-07-01 09:36:05 Task bbdisplay started with PID 6621
2005-07-01 09:36:05 Task bbretest started with PID 6622
2005-07-01 09:37:05 Task bbdisplay started with PID 6660
2005-07-01 09:37:05 Task bbcombotest started with PID 6661
2005-07-01 09:37:05 Task bbnet started with PID 6662
2005-07-01 09:37:05 Task bbretest started with PID 6663
2005-07-01 09:38:05 Task bbdisplay started with PID 6701
2005-07-01 09:38:05 Task bbretest started with PID 6702
2005-07-01 09:38:10 Heartbeat lost for task hobbitd, bouncing it
2005-07-01 09:38:15 Heartbeat lost for task hobbitd, killing it
2005-07-01 09:38:15 Task hobbitd terminated by signal 9
2005-07-01 09:38:15 Loading hostnames
2005-07-01 09:38:15 Loading saved state
2005-07-01 09:38:15 Task hobbitd started with PID 6709
2005-07-01 09:38:15 Setting up network listener on 0.0.0.0:1984
2005-07-01 09:38:15 Setting up signal handlers
2005-07-01 09:38:15 Setting up hobbitd channels
2005-07-01 09:38:15 Setting up logfiles
2005-07-01 09:38:20 Task bbhistory started with PID 6713
2005-07-01 09:38:20 Task bbenadis started with PID 6715
2005-07-01 09:38:20 Task bbpage started with PID 6716
2005-07-01 09:38:20 Task larrdstatus started with PID 6717
2005-07-01 09:38:20 Task larrddata started with PID 6718
2005-07-01 09:39:05 Task bbdisplay started with PID 6750
2005-07-01 09:39:05 Task bbretest started with PID 6751
2005-07-01 09:40:07 Task bbdisplay started with PID 6814
2005-07-01 09:40:07 Task bbretest started with PID 6815
2005-07-01 09:41:08 Task bbdisplay started with PID 6852
2005-07-01 09:41:08 Task bbretest started with PID 6853
2005-07-01 09:42:08 Task bbdisplay started with PID 6890
2005-07-01 09:42:08 Task bbcombotest started with PID 6891
2005-07-01 09:42:08 Task bbnet started with PID 6892
2005-07-01 09:42:08 Task bbretest started with PID 6893
2005-07-01 09:42:13 Heartbeat lost for task hobbitd, bouncing it
2005-07-01 09:42:18 Heartbeat lost for task hobbitd, killing it
2005-07-01 09:42:18 Task hobbitd terminated by signal 9
2005-07-01 09:42:18 Task bbdisplay terminated by signal 15
2005-07-01 09:42:18 Task bbnet terminated by signal 15
2005-07-01 09:48:18 Loading hostnames
2005-07-01 09:48:18 Loading saved state
2005-07-01 09:48:18 Setting up network listener on 0.0.0.0:1984
2005-07-01 09:48:18 Setting up signal handlers
2005-07-01 09:48:18 Setting up hobbitd channels
2005-07-01 09:48:18 Setting up logfiles
2005-07-01 09:48:18 Task hobbitd started with PID 7166
2005-07-01 09:48:23 Task bbhistory started with PID 7170
2005-07-01 09:48:23 Task bbenadis started with PID 7172
2005-07-01 09:48:23 Task bbpage started with PID 7174
2005-07-01 09:48:23 Task larrdstatus started with PID 7175
2005-07-01 09:48:23 Task larrddata started with PID 7176
2005-07-01 09:48:23 Task bbdisplay started with PID 7177
2005-07-01 09:48:23 Task bbcombotest started with PID 7178
2005-07-01 09:48:23 Task bbnet started with PID 7179
2005-07-01 09:48:23 Task bbretest started with PID 7180
2005-07-01 09:48:23 Task larrdcolumn started with PID 7184
2005-07-01 09:48:23 Task infocolumn started with PID 7186
2005-07-01 09:49:27 Task bbdisplay started with PID 7242
2005-07-01 09:49:27 Task bbretest started with PID 7243
2005-07-01 09:50:27 Task bbdisplay started with PID 7293
2005-07-01 09:50:27 Task bbretest started with PID 7294
2005-07-01 09:51:22 Heartbeat lost for task hobbitd, bouncing it
2005-07-01 09:51:27 Heartbeat lost for task hobbitd, killing it
2005-07-01 09:51:27 Task bbdisplay started with PID 7331
2005-07-01 09:51:27 Task bbretest started with PID 7332
2005-07-01 09:51:27 Task hobbitd terminated by signal 9
2005-07-01 09:51:27 Loading hostnames
2005-07-01 09:51:27 Loading saved state
2005-07-01 09:51:27 Setting up network listener on 0.0.0.0:1984
2005-07-01 09:51:27 Setting up signal handlers
2005-07-01 09:51:27 Setting up hobbitd channels
2005-07-01 09:51:27 FATAL: hobbitd sees clientcount 1, should be 0 Check for hanging hobbitd_channel processes or stale semaphores
2005-07-01 09:51:27 Cannot setup status channel
2005-07-01 09:51:27 Task hobbitd started with PID 7333
2005-07-01 09:51:27 Task bbdisplay terminated by signal 15
2005-07-01 09:51:27 Task hobbitd terminated, status 1
2005-07-01 09:51:32 Loading hostnames
2005-07-01 09:51:32 Loading saved state
2005-07-01 09:51:32 Setting up network listener on 0.0.0.0:1984
2005-07-01 09:51:32 Setting up signal handlers
2005-07-01 09:51:32 Setting up hobbitd channels
2005-07-01 09:51:32 Setting up logfiles
2005-07-01 09:51:32 Task hobbitd started with PID 7337
2005-07-01 09:51:37 Task bbhistory started with PID 7341
2005-07-01 09:51:37 Task bbenadis started with PID 7343
2005-07-01 09:51:37 Task bbpage started with PID 7345
2005-07-01 09:51:37 Task larrdstatus started with PID 7346
2005-07-01 09:51:37 Task larrddata started with PID 7347
2005-07-01 09:52:30 Task bbdisplay started with PID 7411
2005-07-01 09:52:30 Task bbretest started with PID 7412
2005-07-01 09:53:24 Task bbcombotest started with PID 7457
2005-07-01 09:53:24 Task bbnet started with PID 7458
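
For readers following along, the restart loop in a log like this can be tallied mechanically. The Python sketch below is an illustration only and not part of Hobbit: the function name `summarize_launch_log` is made up here, and it assumes every entry sits on its own line in the `YYYY-MM-DD HH:MM:SS message` layout that hobbitlaunch.log uses.

```python
import re
from collections import Counter

# Assumed entry layout: "YYYY-MM-DD HH:MM:SS message", one entry per line.
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (.*)$")


def summarize_launch_log(lines, task="hobbitd"):
    """Count start, heartbeat-loss, termination and FATAL events for one task."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip blank lines and anything without a timestamp
        message = match.group(2)
        if message.startswith(f"Task {task} started"):
            counts["started"] += 1
        elif message.startswith(f"Heartbeat lost for task {task}"):
            counts["heartbeat_lost"] += 1
        elif message.startswith(f"Task {task} terminated"):
            counts["terminated"] += 1
        elif message.startswith("FATAL:"):
            counts["fatal"] += 1
    return dict(counts)
```

Calling it with an open file object, for example `summarize_launch_log(open(path))` where `path` points at the local hobbitlaunch.log, gives a quick view of how often the daemon is being bounced.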
list Vernon Everett
Hi Henrik

Been watching this carefully, and seeing the hobbitd test go yellow all the time.
Is there a list of what all the numbers mean, and what they should be in the hobbitd results?
Mine looked like this, which is all very nice, but what do they mean?

Statistics for Hobbit daemon
Up since 01-Jul-2005 07:31:02 (0 days, 00:05:01)

Incoming messages : 145
- status : 122
- combo : 18
- page : 0
- summary : 0
- data : 0
- notes : 0
- enable : 0
- disable : 0
- ack : 0
- config : 0
- query : 0
- hobbitdboard : 5
- hobbitdlog : 0
- drop : 0
- rename : 0
- dummy : 0
- notify : 0
- schedule : 0
- Bogus/Timeouts : 0
Incoming messages/sec : 0 (average last 301 seconds)

status channel messages: 120 (1 readers)
stachg channel messages: 0 (1 readers)
page channel messages: 1 (1 readers)
data channel messages: 0 (1 readers)
notes channel messages: 0 (0 readers)
enadis channel messages: 0 (1 readers)

Latest errormessages:
Loading hostnames
Loading saved state
Setting up network listener on 0.0.0.0:1984
Setting up signal handlers
Setting up hobbitd channels
Setting up logfiles
Setup complete
list Henrik Størner
Hi Vernon, could you try removing the HEARTBEAT line from the first entry in hobbitlaunch.cfg ? It looks like your hobbitd process is being bounced frequently. I've seen that happen for no apparent reason when the heartbeat check has been enabled - on some systems. Regards, Henrik
list Vernon Everett
Hi Henrik

I removed the HEARTBEAT line, and restarted. No change. :-(
In case it helps, I am running Mandrake 10.1

Regards
Vernon
list Vernon Everett
Found this in /var/log/messages

Jul 1 14:01:00 pengo CROND[4492]: (root) CMD (nice -n 19 run-parts /etc/cron.hourly)
Jul 1 14:01:01 pengo msec: changed mode of /var/log/hobbit/hobbitlaunch.pid from 664 to 640
Jul 1 14:01:01 pengo msec: changed mode of /var/log/hobbit/hobbitd.pid from 664 to 640
Jul 1 14:01:01 pengo msec: changed mode of /var/log/wtmp from 664 to 640
Jul 1 14:01:01 pengo msec: changed group of /var/log/wtmp from utmp to adm
Jul 1 14:01:01 pengo msec: changed mode of /var/log/Xorg.0.log from 644 to 640
Jul 1 14:01:01 pengo msec: changed group of /var/log/Xorg.0.log from root to adm
Jul 1 14:06:48 pengo su(pam_unix)[4751]: session opened for user hobbit by root(uid=0)
Jul 1 14:06:48 pengo su[4751]: pam_xauth: error creating temporary file `/usr/lib/hobbit/.xauthyyUFNl': Permission denied
Jul 1 14:06:58 pengo su(pam_unix)[4751]: session closed for user hobbit
Jul 1 14:23:13 pengo crontab[5400]: (root) LIST (root)

Any help?
list Henrik Størner
On Fri, Jul 01, 2005 at 01:51:47PM +0800, Vernon Everett wrote:
> I removed the HEARTBEAT line, and restarted. No change. :-(
> In case it helps, I am running Mandrake 10.1

OK - something similar did happen on my own system a few days ago, but it was so bizarre I wonder if it could happen on two boxes in the same week :-) The Linux kernel was leaking memory, so eventually it ran out of network buffer space and Hobbit couldn't send responses anywhere.

Could you try running "dmesg" and see if there are any "failed allocation" messages at the bottom? This should really only list the messages you see during boot-up and any hardware detection that has happened.

Also, send me a "vmstat 4 20" output from the box.

If you run
    ~hobbit/server/bin/bb 127.0.0.1 "hobbitdboard"
does that hang? What if you do
    ~hobbit/server/bin/bb 127.0.0.1 "hobbitdboard host=YOUR.HOBBIT.HOSTNAME"

Regards,
Henrik
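
The `bb` check above boils down to a TCP request with a timeout. As a rough Python sketch of the same idea, under two assumptions that are not confirmed in this thread: the daemon listens on port 1984 (the port hobbitd reports in its startup messages), and it answers once the client has sent its message and half-closed the connection. `query_hobbitd` is a hypothetical helper name, not a Hobbit tool.

```python
import socket


def query_hobbitd(message, host="127.0.0.1", port=1984, timeout=10.0):
    """Send one message; return ("ok", reply) or (reason, b"") on failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(message.encode("ascii"))
            # Assumption: the daemon replies after the client half-closes.
            conn.shutdown(socket.SHUT_WR)
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:
                    break  # peer closed the connection, reply is complete
                chunks.append(data)
            return "ok", b"".join(chunks)
    except socket.timeout:
        # Connected (or tried to) but nothing came back in time: a hung daemon.
        return "timeout", b""
    except OSError as exc:
        # For example "connection refused" when nothing listens on the port.
        return f"error: {exc}", b""
```

A `"timeout"` result with a daemon that still holds the listening port points at a process that is alive but not servicing requests, which is a different failure from a refused connection.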
list Vernon Everett
Hi Henrik

Thanks for helping on this.
I rebooted this morning. Could the memory leak still affect me in that short time?
No "failed allocation" in dmesg output. Do you want the full output?

[root at pengo log]# vmstat 4 20
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy  id wa
 0  0      0  67916  14428  92136    0    0    19     5 1025   161  1  1  98  1
 0  0      0  67852  14428  92136    0    0     0     0 1024   150  0  1  99  0
 0  0      0  67852  14436  92136    0    0     0     5 1031   157  0  1  99  0
 0  0      0  67852  14444  92136    0    0     0    12 1028   148  0  0 100  0
 0  0      0  67852  14444  92136    0    0     0     2 1025   152  0  0  99  0
 0  0      0  67852  14448  92136    0    0     0     1 1024   154  0  1  99  0
 0  0      0  67852  14448  92136    0    0     0     0 1026   145  0  1 100  0
 0  0      0  67796  14448  92136    0    0     0     0 1028   157  1  1  99  0
 0  0      0  67796  14448  92136    0    0     0     0 1023   149  0  0 100  0
 0  0      0  67796  14456  92136    0    0     0     3 1024   155  1  1  99  0
 0  0      0  67796  14456  92140    0    0     0     0 1037   157  0  1  99  0
 0  0      0  67796  14468  92140    0    0     0    17 1026   150  0  1  99  0
 0  0      0  67796  14468  92140    0    0     0     0 1022   157  0  1  99  0
 0  0      0  67796  14476  92140    0    0     0     4 1022   148  0  0 100  0
 0  0      0  67796  14476  92140    0    0     0     0 1023   157  1  1  99  0
 0  0      0  67796  14476  92140    0    0     0     0 1021   152  0  1 100  0
 0  0      0  67796  14476  92140    0    0     0     0 1019   147  1  0  99  0
 0  0      0  67796  14476  92140    0    0     0     6 1026   153  0  1  99  0
 0  0      0  67796  14492  92140    0    0     2    12 1023   151  0  0  92  8
 0  0      0  67796  14492  92140    0    0     0     0 1024   155  0  1  99  0

All these commands returned to the command prompt with the following error message.

As user hobbit
[hobbit at pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard"
2005-07-01 15:21:45 Whoops ! bb failed to send message - timeout
[hobbit at pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard host=pengo"
2005-07-01 15:21:00 Whoops ! bb failed to send message - timeout
[hobbit at pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard host=pengo.afgonlin.com.au"
2005-07-01 15:21:30 Whoops ! bb failed to send message - timeout

As root
[root at pengo log]# /usr/lib/hobbit/server/bin/bb 127.0.0.1 "hobbitdboard"
2005-07-01 15:17:41 Whoops ! bb failed to send message - timeout
[root at pengo log]# /usr/lib/hobbit/server/bin/bb 127.0.0.1 "hobbitdboard host=pengo"
2005-07-01 15:18:35 Whoops ! bb failed to send message - timeout
[root at pengo log]# /usr/lib/hobbit/server/bin/bb 127.0.0.1 "hobbitdboard host=pengo.afgonlin.com.au"
2005-07-01 15:18:48 Whoops ! bb failed to send message - timeout
list Henrik Størner
On Fri, Jul 01, 2005 at 03:25:30PM +0800, Vernon Everett wrote:
> Thanks for helping on this. I rebooted this morning. Could the memory leak still affect me in that short time?

Probably not. Just wanted to rule out this possibility.

> No "failed allocation" in dmesg output. Do you want the full output?

No, I don't think that is necessary.

> [root at pengo log]# vmstat 4 20

And your system is mostly idle with no swap or disk activity.

> [hobbit at pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard"
> 2005-07-01 15:21:45 Whoops ! bb failed to send message - timeout

Could you try running "strace -p <process-ID of the hobbitd process>" for a minute or two and send me the output, then do a "kill -6 <process-id>" and mail me the core-file from ~hobbit/server/tmp/ together with the ~hobbit/server/bin/hobbitd file?

Also, after this try adding a "--debug" to the hobbitd commandline in hobbitlaunch.cfg. Let it run for a while and then mail me the hobbitd.log file.

This bug sounds a bit nasty, I think ....

Regards,
Henrik
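
Before sending `kill -6` (SIGABRT) to collect a core file, it can help to confirm that the process is allowed to write one: with a soft core-size limit of 0 the kernel writes no core file. The Python sketch below is only an illustration. It assumes a Linux kernel that exposes `/proc/<pid>/limits`, which newer kernels do and a 2005-era system may not, and the helper names `core_limit` and `read_core_limit` are made up here.

```python
def core_limit(limits_text):
    """Return (soft, hard) core-size limits parsed from /proc/<pid>/limits text."""
    label = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Columns after the label: soft limit, hard limit, units.
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None  # no such row in the text


def read_core_limit(pid):
    """Read the limits of a running process (assumes /proc/<pid>/limits exists)."""
    with open(f"/proc/{pid}/limits") as handle:
        return core_limit(handle.read())
```

A result such as `("0", "unlimited")` would mean core dumps are switched off for that process even though the hard limit allows them. The limit is inherited from whatever started the process, so a daemon and an interactive shell on the same box can differ.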
list Vernon Everett
Hi Henrik
It should be idle. All the system does is run hobbit. :-)
Hobbitd is currently dead in the water.
[root at pengo log]# strace -p 3025
Process 3025 attached - interrupt to quit
futex(0x40141b20, FUTEX_WAIT, 2, NULL
And it's been like this a while.
When I did the kill -6 I got this.
[root at pengo log]# strace -p 3025
Process 3025 attached - interrupt to quit
futex(0x40141b20, FUTEX_WAIT, 2, NULL) = -1 EINTR (Interrupted system call)
--- SIGABRT (Aborted) @ 0 (0) ---
Process 3025 detached
Which I suppose was expected :-)
I restarted it, and got this.
[root at pengo etc]# strace -p 9223
Process 9223 attached - interrupt to quit
semop(32769, 0xbfffe3a0, 1
Nope, there is nothing I forgot to cut and paste.
That really was it.
And this shit just gets stranger and stranger.
It isn't dumping core.
I hit it with a kill -6 and nothing happens.
I then thought maybe we were both mistaken and had the command wrong, or
my Linux defaulted to not dumping core, so I started vi in a session and did
a kill -6 on that. That dumped?!
Hobbit isn't dumping.
I rebooted and tried again.
I managed to get a nice strace output - see attached - but still no damn
core.
OK, I added debug, and restarted.
When I went to check the logs, I found this in hobbitlaunch.log.
---snip---
2005-07-01 16:37:21 Loading tasklist configuration from /usr/lib/hobbit/server/etc/hobbitlaunch.cfg
2005-07-01 16:37:21 Loading hostnames
2005-07-01 16:37:21 Loading saved state
2005-07-01 16:37:21 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:21 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:21 Task hobbitd started with PID 4761
2005-07-01 16:37:26 Task hobbitd terminated, status 1
2005-07-01 16:37:26 Loading hostnames
2005-07-01 16:37:26 Loading saved state
2005-07-01 16:37:26 Task hobbitd started with PID 4765
2005-07-01 16:37:26 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:26 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:26 Task hobbitd terminated, status 1
2005-07-01 16:37:31 Loading hostnames
2005-07-01 16:37:31 Loading saved state
2005-07-01 16:37:31 Task hobbitd started with PID 4770
2005-07-01 16:37:31 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:31 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:31 Task hobbitd terminated, status 1
2005-07-01 16:37:36 Task hobbitd started with PID 4774
2005-07-01 16:37:36 Loading hostnames
2005-07-01 16:37:36 Loading saved state
2005-07-01 16:37:36 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:36 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:36 Task hobbitd terminated, status 1
2005-07-01 16:37:41 Task hobbitd started with PID 4778
2005-07-01 16:37:41 Loading hostnames
2005-07-01 16:37:41 Loading saved state
2005-07-01 16:37:41 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:41 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:41 Task hobbitd terminated, status 1
2005-07-01 16:37:46 Task hobbitd started with PID 4783
2005-07-01 16:37:46 Loading hostnames
2005-07-01 16:37:46 Loading saved state
2005-07-01 16:37:46 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:46 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:46 Task hobbitd terminated, status 1
---snip---
Looks like a clue.
I will add the output of netstat -a
Got the hobbitd.log file for you too.
Let me know if there is anything else I can get you.
Regards
Vernon
P.S. Your cold one is quickly becoming many cold ones if you ever get to
Perth
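
"Cannot bind to listen socket (Address already in use)" generally means some other process still holds the port. One plausible reading, not confirmed in this thread, is that an earlier hobbitd never exited and each new copy fails against it. A small Python sketch of a check for that condition follows; `port_in_use` is a hypothetical helper, and the port number 1984 comes from the log above.

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # other failures (e.g. permissions) are a different problem
    finally:
        probe.close()
    return False
```

Running `port_in_use(1984)` while hobbitlaunch is stopped shows whether something is still bound to the port. It does not say which process that is; the `netstat` output mentioned above is the place to look for the owner.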
list Stefan Loos
Hello Vernon,
Can you tell me if there is anything like "hobbitd status board not available" in the bb-display.log?
Regards,
Stefan
list Vernon Everett
Yes.
Quite often.
---snip---
2005-07-04 14:09:17 Whoops ! bb failed to send message - timeout
2005-07-04 14:09:17 Could not get the Hobbit statuslog-list
2005-07-04 14:09:50 Whoops ! bb failed to send message - timeout
2005-07-04 14:09:50 hobbitd status-board not available
2005-07-04 14:10:49 Whoops ! bb failed to send message - timeout
2005-07-04 14:10:49 hobbitd status-board not available
2005-07-04 14:11:49 Whoops ! bb failed to send message - timeout
2005-07-04 14:11:49 hobbitd status-board not available
2005-07-04 14:12:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:12:52 hobbitd status-board not available
2005-07-04 14:13:50 Whoops ! bb failed to send message - timeout
2005-07-04 14:13:50 hobbitd status-board not available
2005-07-04 14:14:50 Whoops ! bb failed to send message - timeout
2005-07-04 14:14:50 hobbitd status-board not available
2005-07-04 14:16:22 Whoops ! bb failed to send message - timeout
2005-07-04 14:16:22 hobbitd status-board not available
2005-07-04 14:16:22 WARNING: Runtime 61 longer than BBSLEEP (60)
2005-07-04 14:16:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:16:52 hobbitd status-board not available
2005-07-04 14:17:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:17:52 hobbitd status-board not available
2005-07-04 14:18:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:18:52 hobbitd status-board not available
2005-07-04 14:19:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:19:52 hobbitd status-board not available
2005-07-04 14:21:26 Whoops ! bb failed to send message - timeout
2005-07-04 14:21:26 hobbitd status-board not available
2005-07-04 14:21:26 WARNING: Runtime 61 longer than BBSLEEP (60)
2005-07-04 14:21:59 Whoops ! bb failed to send message - timeout
2005-07-04 14:21:59 hobbitd status-board not available
---snip---
-----Original Message-----
From: Stefan Loos [mailto:user-dea24d965402@xymon.invalid]
Sent: Monday, 4 July 2005 2:16 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: RE: [hobbit] Status Unavailable
Hello Vernon,
can you tell me, if there is anything like "hobbitd status board not
available" in the bb-display.log?
Regards,
Stefan
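Stefan's question can also be answered by counting the relevant messages in the log rather than reading it by eye. What follows is a minimal Python sketch, assuming the log is plain text with one message per line; the function name and the path in the usage note are hypothetical. The log writes "status-board" with a hyphen while the question spells it "status board", so the pattern accepts both.

```python
import re
from collections import Counter

# Message fragments taken from the bb-display.log excerpt in this thread.
PATTERNS = {
    "send_timeout": re.compile(r"bb failed to send message - timeout"),
    "board_unavailable": re.compile(r"hobbitd status[- ]board not available"),
    "slow_run": re.compile(r"Runtime \d+ longer than BBSLEEP"),
}


def summarize(lines):
    """Count how many lines match each known failure message."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Usage would be along the lines of `summarize(open("bb-display.log"))`, with the path adjusted to wherever the Hobbit logs live on the server in question.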
list Stefan Loos
And can you stop the hobbit server with hobbit.sh or is one process still running after that?
list Vernon Everett
Never actually checked that. Nope, hobbitd refuses to die. You got any ideas?
-----Original Message-----
From: Stefan Loos [mailto:user-dea24d965402@xymon.invalid]
Sent: Monday, 4 July 2005 2:43 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: RE: [hobbit] Status Unavailable
And can you stop the hobbit server with hobbit.sh or is one process
still running after that?
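If a hung hobbitd survives being stopped with hobbit.sh, it may still be holding the listener port, which would fit the "Cannot bind to listen socket (Address already in use)" messages from hobbitlaunch.log quoted earlier in this thread. What follows is a minimal Python sketch, assuming hobbitd listens on TCP port 1984 as those log lines show; `port_in_use` is a hypothetical helper, not a Hobbit tool.

```python
import socket


def port_in_use(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise,
        # so a refused or timed-out connection yields False.
        return sock.connect_ex((host, port)) == 0


# Example (assumed values): check the local Hobbit listener after a stop.
# print(port_in_use("127.0.0.1", 1984))
```

A True result after stopping Hobbit means some process still holds the port. The kernel completes connections for a listening socket even when its owner is hung, so this shows only that the port is held, not that hobbitd is healthy (a full accept backlog can make the check time out instead). Identifying the owning PID still takes a tool such as netstat or lsof.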
list Stefan Loos
Hello Vernon,
I think you see the same problem that I reported to Henrik in May (bbdisplay problems after adding some hosts). The only difference is that my hobbit server runs normally as long as no other hosts are added to bb-hosts.
I wanted to find out if it's an OS problem (we use RedHat Enterprise 3) and installed hobbit on a SuSE system. It shows the same problems....
Regards,
Stefan
list Vernon Everett
I think mine started after I removed a test, but I am not sure. It might have happened before.
-----Original Message-----
From: Stefan Loos [mailto:user-dea24d965402@xymon.invalid]
Sent: Monday, 4 July 2005 3:04 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: RE: [hobbit] Status Unavailable
Hello Vernon,
I think you see the same problem that I reported to Henrik in May
(bbdisplay problems after adding some hosts).
The only difference is that my hobbit server runs normally as long as no
other hosts are added to bb-hosts.
I wanted to find out if it's an OS problem (we use RedHat Enterprise 3)
and installed hobbit on a SuSE system. It shows the same problems....
Regards,
Stefan
list Vernon Everett
Hi Stefan, did you ever get your Hobbit up and running again?
list Henrik Størner
Hi Vernon,
could you try disabling the [larrdcolumn] and [infocolumn] tasks in your hobbitlaunch.cfg? These are no longer used, and I suspect that the code in hobbitd that handles these messages is somewhat flaky.
If hobbitd refuses to stop, you'll have to do it with a "kill -9". When you do that, please check that there are no resources left allocated by hobbitd:
   login as the hobbit user
   run "ipcs -m" and "ipcs -s"
If you see something like
   hobbit@osiris:~$ ipcs -m
   ------ Shared Memory Segments --------
   key        shmid      owner   perms  bytes   nattch  status
   0x01020b1f 482148354  hobbit  600    102400  2
   0x02020b1f 482181123  hobbit  600    102400  2
   0x03020b1f 482213892  hobbit  600    102400  2
   0x04020b1f 482246661  hobbit  600    102400  2
   0x05020b1f 482279430  hobbit  600    102400  1
   0x06020b1f 482312199  hobbit  600    102400  1
   hobbit@osiris:~$ ipcs -s
   ------ Semaphore Arrays --------
   key        semid    owner   perms  nsems
   0x01020b1f 3276801  hobbit  600    3
   0x02020b1f 3309570  hobbit  600    3
   0x03020b1f 3342339  hobbit  600    3
   0x04020b1f 3375108  hobbit  600    3
   0x05020b1f 3407877  hobbit  600    3
   0x06020b1f 3440646  hobbit  600    3
then it hasn't cleaned up properly and you should either reboot the box or use "ipcrm -m <shmid>" and "ipcrm -s <semid>" to delete these.
Regards,
Henrik
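The cleanup Henrik describes can be sketched in a few lines of Python. This is only an illustration, not part of Hobbit: the function names are made up here, and the parser assumes the Linux-style "ipcs" layout shown in the message (key, id, owner in the first three columns). It only prints the "ipcrm" commands and does not run them, so they can be reviewed first, and it should only be used once hobbitd has really been stopped.

```python
def leftover_ipc_ids(ipcs_output, owner="hobbit"):
    """Return the IDs (second column) of ipcs rows owned by `owner`.

    Assumes rows of the form: key id owner perms ...
    Header and separator lines are skipped because their first
    field does not start with "0x".
    """
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].startswith("0x") and fields[2] == owner:
            ids.append(fields[1])
    return ids


def ipcrm_commands(shm_output, sem_output, owner="hobbit"):
    """Build the ipcrm commands for leftover segments and semaphores."""
    cmds = [f"ipcrm -m {i}" for i in leftover_ipc_ids(shm_output, owner)]
    cmds += [f"ipcrm -s {i}" for i in leftover_ipc_ids(sem_output, owner)]
    return cmds


# Two rows copied from the example output in the message above.
SAMPLE_SHM = """\
------ Shared Memory Segments --------
key        shmid      owner   perms  bytes   nattch  status
0x01020b1f 482148354  hobbit  600    102400  2
0x02020b1f 482181123  hobbit  600    102400  2
"""

SAMPLE_SEM = """\
------ Semaphore Arrays --------
key        semid    owner   perms  nsems
0x01020b1f 3276801  hobbit  600    3
"""

if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in ipcrm_commands(SAMPLE_SHM, SAMPLE_SEM):
        print(cmd)
```

In practice the two text arguments would be the captured output of "ipcs -m" and "ipcs -s" run as the hobbit user.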
list Vernon Everett
Nope! Hashed out the [larrdcolumn] and [infocolumn] sections. Had a look. Saw your shared mem segments. Didn't even look for semaphores. Decided a reboot is probably easier. Hobbit started on its own. It looked promising, and then after about 2 minutes, back to its usual crap. Green background. Header & footer only :-(
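For readers following along, "hashing out" the two sections can be sketched as below. This is a rough illustration only: it assumes hobbitlaunch.cfg is laid out as "[name]" header lines, each followed by that task's settings up to the next header, and the sample settings are placeholders rather than the real commands from a Hobbit installation. The function name is hypothetical.

```python
def comment_out_sections(config_text, names):
    """Prefix every non-blank line of the named [sections] with '#'.

    A section is assumed to run from its "[name]" header line up to
    the next "[...]" header. Lines that are already comments are left
    alone, as are blank lines.
    """
    out = []
    disabling = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section starts; decide whether it should be disabled.
            disabling = stripped[1:-1] in names
        if disabling and stripped and not stripped.startswith("#"):
            line = "#" + line
        out.append(line)
    return "\n".join(out)


# Placeholder configuration; the CMD values are NOT real Hobbit commands.
SAMPLE = """\
[hobbitd]
    CMD placeholder-hobbitd-command

[larrdcolumn]
    NEEDS hobbitd
    CMD placeholder-larrdcolumn-command

[infocolumn]
    NEEDS hobbitd
    CMD placeholder-infocolumn-command
"""

if __name__ == "__main__":
    print(comment_out_sections(SAMPLE, {"larrdcolumn", "infocolumn"}))
```

Working on a copy of the file and restarting Hobbit afterwards keeps the change easy to undo.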
list Stefan Loos
Hi Vernon, if I disable sending messages to the hobbit-server (block port 1984) then everything seems to run. But that doesn't make sense with a monitoring server!
list Stefan Loos
You're running the hobbit server without the HEARTBEAT option - right?
list Vernon Everett
Yes. That's the first thing Henrik asked me to disable.
list Henrik Størner
On Mon, Jul 04, 2005 at 07:04:02AM +0000, Stefan Loos wrote:
> I think you see the same problem that I reported to Henrik in May (bbdisplay problems after adding some hosts). The only difference is that my hobbit server runs normally as long as no other hosts are added to bb-hosts. I wanted to find out if it's an OS problem (we use RedHat Enterprise 3) and installed hobbit on a SuSE system. It shows the same problems....
Well, I've found *one* bug that might explain this - but I do not yet know if it's the one that bites Vernon.
It's triggered by an invalid status message that doesn't include any column-name.
Stefan - you might want to try out the current snapshot available at http://www.hswn.dk/beta/hobbit-snapshot.tar.gz
You build it the normal way (configure; make) - but instead of running "make install" just copy the hobbitd/hobbitd binary to your ~hobbit/server/bin/
After restarting Hobbit you should see some "Bogus status message" entries in the hobbitd.log file if this problem occurs, but it should no longer crash.
Regards,
Henrik
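To make the failure mode concrete, here is a small Python sketch of the kind of check that rejects a status message with no column name. It is not hobbitd's code: the function name is invented, and it assumes the Big Brother-style layout "status HOST.COLUMN COLOR text", where the column name is whatever follows the last dot of the second word.

```python
def split_status_message(msg):
    """Return (host, column, color), or None if the message is malformed.

    Assumed layout (Big Brother style): status HOST.COLUMN COLOR text...
    A message whose second word has no ".COLUMN" part is rejected
    instead of being processed further.
    """
    parts = msg.split(None, 3)
    if len(parts) < 3 or not parts[0].startswith("status"):
        return None
    # The column name is the part after the last dot.
    host, sep, column = parts[1].rpartition(".")
    if not sep or not host or not column:
        return None
    return host, column, parts[2]


if __name__ == "__main__":
    for sample in ("status www,example,com.http green OK",
                   "status www,example,com green no column here"):
        print(repr(sample), "->", split_status_message(sample))
```

A server built along these lines would log the rejected message and carry on, which matches the "Bogus status message" behaviour described for the snapshot.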
list Stefan Loos
Hello Henrik, I installed the snapshot as you described - no errors so far. Regards, Stefan
list Stefan Loos
Hello Henrik, the hobbit-server is still running without problems! And there are no "status board not available" entries in the bb-display.log. The hobbitd.log shows only the "Setup complete" message since the last restart. I will wait until tomorrow and then reinstall RedHat on that server (I only have one for testing). Regards, Stefan
list Henrik Størner
On Tue, Jul 05, 2005 at 12:22:58PM +0000, Stefan Loos wrote:
> the hobbit-server is still running without problems! And there are no "status board not available" in the bb-display.log. The hobbitd.log shows only the "Setup complete" message since the last restart.
And you have added the problematic hosts that used to trigger this problem?
Regards,
Henrik
list Stefan Loos
Let me say it in other words: I didn't drop a host from bb-hosts. So if there was any problematic host then it's still there. Regards, Stefan
list Stefan Loos
Hello Henrik, after the server had run for a few days the error came back (even with the hobbitd from the snapshot version you gave me). It would be interesting to hear from Vernon if his server still has this problem.... Regards, Stefan
list Vernon Everett
I am still here, and still having major problems :-( I thought we had isolated the problem to messages from a particular monitored client, but after a few hours of smooth running, with messages from that client disabled, it failed again. Regards Vernon
list Stefan Loos
Did you use the hobbitd from the snapshot?
list Vernon Everett
Yep. I did. Between myself and Henrik, we have tried a number of versions, and a few special diagnostic versions of Hobbit. We have been working mostly off-list because we have been exchanging potentially confidential information, and don't believe that our failure to diagnose a problem is of general interest. (I am sure a full account of this sad tale will be posted once it is resolved.) Kudos to Henrik though. I think he has really tried. He has worked tirelessly to try and resolve this issue, which I believe is very admirable, when you consider his reward for all this hard work. Henrik is a true mensch. (Yes, that's an English word. Look it up) So far, we have not been able to identify the root cause of the problem. Henrik was going to have a look at some of the messages that came from one of the hosts, and get back to me. Stefan, if you are interested in assisting us with this, then myself and Henrik can cc you in our off-list exchanges. Have you got any theories as to the cause?
Regards
Vernon
list Stefan Loos
It would be great if you can put me in cc. If you want I can try to assist you. I'm at a point where I don't know what to try anymore. I think it isn't easy for Henrik to find this issue - I have no coredumps and nothing in the logfile that could help. And I never had any doubt that Henrik does a great job! (I think my English is not good enough to say it in other words) So if there is anything I can do to solve this problem.... I've tried to look up "mensch" but I think I'm using the wrong sites - German-English dictionaries always recognize mensch as a German word ;-) Regards, Stefan
list Vernon Everett
Try http://www.onelook.com/?w=mensch&ls=a :-) Once Hobbit hangs, it refuses to core. The only thing that will clobber it is a kill -9. When it hangs, it hangs good. The logs are devoid of useful info too, even with the latest build. This is probably the most frustrating issue I have ever seen.
▸
Regards
Vernon
list Stefan Loos
Funny thing - your link brought me to http://germanenglishwords.com. Are you really using these words in English? I always thought that kindergarten and sauerkraut were the only words used in English-speaking countries....
Yes, it's really frustrating! So many others here on this mailing list don't have that problem. Are you using subpages in bb-hosts? Do you have many self-written monitoring scripts?
Regards,
Stefan
list Henrik Størner
Hi Stefan,
On Mon, Jul 11, 2005 at 09:36:42AM +0000, Stefan Loos wrote:
> It would be great if you can put me in cc. If you want I can try to assist you. I'm at a point where I don't know what to try anymore. I think it isn't easy for Henrik to find this issue - I have no coredumps and nothing in the logfile what could help.
Yes, this is a really nasty problem. Vernon and I thought we had it nailed down by the end of last week, but there's more to it than what we found then.
What kind of external scripts run on your clients, apart from the BB client? The current suspicion is that this is triggered by a status message that Hobbit handles badly, causing this lock-up. So I'm trying to see if there might be something in common between your setups.
And what kind of system are you running Hobbit on? If Linux, which distribution? Another suspicion I have is that this might be a problem with the implementation of SysV IPC semaphores.
Regards,
Henrik
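The two suspicions can combine: if a worker chokes on one message and never signals completion, whoever waits on the semaphore hangs without logging anything. A minimal sketch of that general failure mode, using Python's `threading.Semaphore` as a stand-in for SysV semaphores. This is not Hobbit's code; `deliver`, the parsing rule and the sample messages are invented for illustration:

```python
import threading


def deliver(message: str, timeout: float = 0.5) -> bool:
    """Hand one message to a worker thread and wait for its acknowledgement.

    Returns True if the worker acknowledged, False if the wait timed out.
    """
    ack = threading.Semaphore(0)  # the worker releases this when it is done

    def worker() -> None:
        try:
            # Hypothetical parsing rule: "status <host.test> <colour> ..."
            _host_test, _colour = message.split()[1:3]
        except ValueError:
            return  # the bug being illustrated: error path never releases
        ack.release()

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    # A plain ack.acquire() would block forever after the error path;
    # the timeout turns a silent hang into a visible failure.
    return ack.acquire(timeout=timeout)


if __name__ == "__main__":
    print(deliver("status www,example,com.http green OK"))  # acknowledged
    print(deliver("status"))                                # times out
```

Waiting with a timeout does not fix anything by itself, but it shows which message was in flight when the acknowledgement went missing.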
list Stefan Loos
Hi Henrik,
we have several self-written scripts (mostly in Perl) which monitor Oracle instances and BEA WebLogic servers. There is one for hardware monitoring - HP (Intel) and Sun servers (prtdiag based) - and some are just the output of an HTTP request to the software running on those WebLogic servers.
I have just one server for testing, it's an HP DL 360. We are running Redhat Enterprise Server 3 but I've tried it with a SuSE 9.3 too.
Regards,
Stefan
list Reif Jeffery M
This is a shot in the dark - I had a lockup problem on a BB system where the kernel was compiled with a different compiler version than the one included with the system, and there were some IPC-related changes between compiler versions.
list Vernon Everett
English has no shame. It will borrow, beg or steal words from any language :-)
Mensch was taken from Yiddish, a hybrid of old German and Hebrew.
From Onelook.com:
Quick definitions (mensch) noun: a decent responsible person with admirable characteristics
From MSN Encarta:
mensch (plural mensch*en or mensch*es) or mensh (plural mensh*en or mensh*es) noun
good person: somebody good, kind, decent, and honorable (informal)
[Mid-20th century. Via Yiddish < Old High German mennisco "person, human"]
list Stefan Loos
Hello Jeffery,
do you think we see this issue because the kernel was built with a different compiler than the one I used to build the Hobbit server?
@Henrik - do you think this could be the problem?
Regards,
Stefan
list Reif Jeffery M
Stefan,
I don't know if this is your problem. I just suggested it as a possibility for you to consider. I originally saw random lockups on a BB system after upgrading from Redhat 7 to (8 or 9, I don't remember). I thought this might be possible on other versions or distributions as well.
The solution in my case was to either:
1) Downgrade the compiler to match the OS.
2) Recompile the kernel.
3) Change OS versions (sounds like you may have tried this).
I hope this helps in some way. Good luck on your problem.
Jeff
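To check for the mismatch Jeff describes, one can compare the compiler recorded in the running kernel's version banner with the compiler installed on the system. A rough sketch, assuming a Linux host whose /proc/version contains a "gcc version X.Y.Z" substring (typical for kernels of this era, not guaranteed elsewhere); the helper name and the sample banner are illustrative only:

```python
import re
from typing import Optional


def kernel_gcc_version(banner: str) -> Optional[str]:
    """Extract the gcc version from a /proc/version style banner, if present."""
    match = re.search(r"gcc version (\d+(?:\.\d+)+)", banner)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Illustrative banner only, not taken from any system in this thread.
    sample = "Linux version 2.4.21 (build@host) (gcc version 3.2.3 20030502) #1 SMP"
    print(kernel_gcc_version(sample))  # 3.2.3
```

On a real system the banner would come from reading /proc/version, and the result can be compared by eye with the output of `gcc -dumpversion`.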
list Henrik Størner
In <user-2d5f2349ef80@xymon.invalid> "Stefan Loos" <user-dea24d965402@xymon.invalid> writes:
> do you think we see this issue because the kernel was built with a different compiler than I was using for building the hobbit server?
> @Henrik - do you think this could be the problem?
No, I don't think so. Compiler versions should not matter; in the end it's just binary code.
One thing I do have in mind as a potential source of this problem is the fact that newer Linux distributions tend to stuff all sorts of new "scalability" features into their kernels and libc libraries. This could mean that they come with versions of the kernel and/or libraries that have bugs which Hobbit happens to trigger - some of the features that Hobbit uses are not terribly common for applications, so there could be bugs that just haven't been discovered yet.
But for now, let's assume that the bug is in Hobbit (until we can prove otherwise).
Henrik