Xymon Mailing List Archive

Status Unavailable

43 messages in this thread

list Vernon Everett · Fri, 1 Jul 2005 09:03:50 +0800 ·
Hi all

Here's an interesting one.

I have been running Hobbit for a few months now, and everything has been
good.
Upgraded to 4.0.3, and all was well.

Then suddenly, a few days ago, I started getting intermittent errors when
I click on the faces/blobs for more information.
Sometimes it works, and I get what I want.
Other times, it times out and I get "Status not available" on a white
screen.

I tried the following
1. Restarting Hobbit.
2. Restarting the web server
3. Rebooting the server
4. Upgrading to Hobbit 4.0.4

Still no improvement.

Looking again, I can rule out the Web server, because I get the main
page, and the column info links on the column description work.
I think I can rule out the network, because all the monitored machines
are reachable and pingable from Hobbit.
I think I ruled out a bug by upgrading to 4.0.4.

I am still getting e-mail alerts (some of the time) and graphs are being
updated (partially).
When it does the "Status not available" trick for long periods of time,
it doesn't update the status or the graphs during that period.

Any ideas, pointers or tips?

Regards
   Vernon


_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

NOTICE: This message and any attachments are confidential and may contain copyright material 
of Australian Finance Group Limited or a third party. It is intended solely for the purpose of the 
addressee and any other named recipient. If you are not the intended recipient, any use, 
distribution, disclosure or copying of this message is strictly prohibited. The confidentiality attached
to this message is not waived or lost by reason of the mistaken transmission or delivery to any 
unintended party. If you have received this message in error, please notify the author immediately or 
contact Australian Finance Group on +61 8 9420 7888.
list Vernon Everett · Fri, 1 Jul 2005 09:09:31 +0800 ·
Something else I noticed.
Sometimes, the main screen gives me the header and the footer, and no
info.
No blobs/smileys, no columns, no server names, nothing.
Just the menu, the logo, "CURRENT STATUS" and the date at the top, and a
line.
Then the footer. Line and Hobbit version.
It is also on a green background, even though I have at least one test
that is reporting yellow.

Regards
   Vernon
list Vernon Everett · Fri, 1 Jul 2005 09:55:52 +0800 ·
Also found this in the hobbitlaunch.log

Don't understand it much, but it might help you guys.


2005-07-01 09:35:10 Setting up hobbitd channels
2005-07-01 09:35:10 Setting up logfiles
2005-07-01 09:35:10 Task hobbitd started with PID 6563
2005-07-01 09:35:15 Task bbhistory started with PID 6567
2005-07-01 09:35:15 Task bbenadis started with PID 6569
2005-07-01 09:35:15 Task bbpage started with PID 6571
2005-07-01 09:35:15 Task larrdstatus started with PID 6572
2005-07-01 09:35:15 Task larrddata started with PID 6573
2005-07-01 09:36:05 Task bbdisplay started with PID 6621
2005-07-01 09:36:05 Task bbretest started with PID 6622
2005-07-01 09:37:05 Task bbdisplay started with PID 6660
2005-07-01 09:37:05 Task bbcombotest started with PID 6661
2005-07-01 09:37:05 Task bbnet started with PID 6662
2005-07-01 09:37:05 Task bbretest started with PID 6663
2005-07-01 09:38:05 Task bbdisplay started with PID 6701
2005-07-01 09:38:05 Task bbretest started with PID 6702
2005-07-01 09:38:10 Heartbeat lost for task hobbitd, bouncing it
2005-07-01 09:38:15 Heartbeat lost for task hobbitd, killing it
2005-07-01 09:38:15 Task hobbitd terminated by signal 9
2005-07-01 09:38:15 Loading hostnames
2005-07-01 09:38:15 Loading saved state
2005-07-01 09:38:15 Task hobbitd started with PID 6709
2005-07-01 09:38:15 Setting up network listener on 0.0.0.0:1984
2005-07-01 09:38:15 Setting up signal handlers
2005-07-01 09:38:15 Setting up hobbitd channels
2005-07-01 09:38:15 Setting up logfiles
2005-07-01 09:38:20 Task bbhistory started with PID 6713
2005-07-01 09:38:20 Task bbenadis started with PID 6715
2005-07-01 09:38:20 Task bbpage started with PID 6716
2005-07-01 09:38:20 Task larrdstatus started with PID 6717
2005-07-01 09:38:20 Task larrddata started with PID 6718
2005-07-01 09:39:05 Task bbdisplay started with PID 6750
2005-07-01 09:39:05 Task bbretest started with PID 6751
2005-07-01 09:40:07 Task bbdisplay started with PID 6814
2005-07-01 09:40:07 Task bbretest started with PID 6815
2005-07-01 09:41:08 Task bbdisplay started with PID 6852
2005-07-01 09:41:08 Task bbretest started with PID 6853
2005-07-01 09:42:08 Task bbdisplay started with PID 6890
2005-07-01 09:42:08 Task bbcombotest started with PID 6891
2005-07-01 09:42:08 Task bbnet started with PID 6892
2005-07-01 09:42:08 Task bbretest started with PID 6893
2005-07-01 09:42:13 Heartbeat lost for task hobbitd, bouncing it
2005-07-01 09:42:18 Heartbeat lost for task hobbitd, killing it
2005-07-01 09:42:18 Task hobbitd terminated by signal 9
2005-07-01 09:42:18 Task bbdisplay terminated by signal 15
2005-07-01 09:42:18 Task bbnet terminated by signal 15
2005-07-01 09:48:18 Loading hostnames
2005-07-01 09:48:18 Loading saved state
2005-07-01 09:48:18 Setting up network listener on 0.0.0.0:1984
2005-07-01 09:48:18 Setting up signal handlers
2005-07-01 09:48:18 Setting up hobbitd channels
2005-07-01 09:48:18 Setting up logfiles
2005-07-01 09:48:18 Task hobbitd started with PID 7166
2005-07-01 09:48:23 Task bbhistory started with PID 7170
2005-07-01 09:48:23 Task bbenadis started with PID 7172
2005-07-01 09:48:23 Task bbpage started with PID 7174
2005-07-01 09:48:23 Task larrdstatus started with PID 7175
2005-07-01 09:48:23 Task larrddata started with PID 7176
2005-07-01 09:48:23 Task bbdisplay started with PID 7177
2005-07-01 09:48:23 Task bbcombotest started with PID 7178
2005-07-01 09:48:23 Task bbnet started with PID 7179
2005-07-01 09:48:23 Task bbretest started with PID 7180
2005-07-01 09:48:23 Task larrdcolumn started with PID 7184
2005-07-01 09:48:23 Task infocolumn started with PID 7186
2005-07-01 09:49:27 Task bbdisplay started with PID 7242
2005-07-01 09:49:27 Task bbretest started with PID 7243
2005-07-01 09:50:27 Task bbdisplay started with PID 7293
2005-07-01 09:50:27 Task bbretest started with PID 7294
2005-07-01 09:51:22 Heartbeat lost for task hobbitd, bouncing it
2005-07-01 09:51:27 Heartbeat lost for task hobbitd, killing it
2005-07-01 09:51:27 Task bbdisplay started with PID 7331
2005-07-01 09:51:27 Task bbretest started with PID 7332
2005-07-01 09:51:27 Task hobbitd terminated by signal 9
2005-07-01 09:51:27 Loading hostnames
2005-07-01 09:51:27 Loading saved state
2005-07-01 09:51:27 Setting up network listener on 0.0.0.0:1984
2005-07-01 09:51:27 Setting up signal handlers
2005-07-01 09:51:27 Setting up hobbitd channels
2005-07-01 09:51:27 FATAL: hobbitd sees clientcount 1, should be 0
Check for hanging hobbitd_channel processes or stale semaphores
2005-07-01 09:51:27 Cannot setup status channel
2005-07-01 09:51:27 Task hobbitd started with PID 7333
2005-07-01 09:51:27 Task bbdisplay terminated by signal 15
2005-07-01 09:51:27 Task hobbitd terminated, status 1
2005-07-01 09:51:32 Loading hostnames
2005-07-01 09:51:32 Loading saved state
2005-07-01 09:51:32 Setting up network listener on 0.0.0.0:1984
2005-07-01 09:51:32 Setting up signal handlers
2005-07-01 09:51:32 Setting up hobbitd channels
2005-07-01 09:51:32 Setting up logfiles
2005-07-01 09:51:32 Task hobbitd started with PID 7337
2005-07-01 09:51:37 Task bbhistory started with PID 7341
2005-07-01 09:51:37 Task bbenadis started with PID 7343
2005-07-01 09:51:37 Task bbpage started with PID 7345
2005-07-01 09:51:37 Task larrdstatus started with PID 7346
2005-07-01 09:51:37 Task larrddata started with PID 7347
2005-07-01 09:52:30 Task bbdisplay started with PID 7411
2005-07-01 09:52:30 Task bbretest started with PID 7412
2005-07-01 09:53:24 Task bbcombotest started with PID 7457
2005-07-01 09:53:24 Task bbnet started with PID 7458
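
To put numbers on how often hobbitd is being killed, the timestamps in a log like the one above can be paired up. What follows is only a minimal Python sketch, not part of Hobbit: the function name hobbitd_lifetimes is made up for illustration, the sample is a trimmed excerpt of the log above, and it assumes the log keeps the "Task hobbitd started" / "Task hobbitd terminated" wording shown here.

```python
import re
from datetime import datetime

# Every real log entry starts with "YYYY-MM-DD HH:MM:SS "; continuation lines do not.
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (.*)$")


def hobbitd_lifetimes(lines):
    """Return (start, end, seconds) for each hobbitd run that was terminated."""
    started = None
    runs = []
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # skip lines without a timestamp
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        text = match.group(2)
        if text.startswith("Task hobbitd started"):
            started = when
        elif text.startswith("Task hobbitd terminated") and started is not None:
            runs.append((started, when, int((when - started).total_seconds())))
            started = None
    return runs


# Trimmed excerpt of the hobbitlaunch.log posted above.
sample = """\
2005-07-01 09:35:10 Task hobbitd started with PID 6563
2005-07-01 09:38:10 Heartbeat lost for task hobbitd, bouncing it
2005-07-01 09:38:15 Heartbeat lost for task hobbitd, killing it
2005-07-01 09:38:15 Task hobbitd terminated by signal 9
2005-07-01 09:38:15 Task hobbitd started with PID 6709
2005-07-01 09:42:13 Heartbeat lost for task hobbitd, bouncing it
2005-07-01 09:42:18 Heartbeat lost for task hobbitd, killing it
2005-07-01 09:42:18 Task hobbitd terminated by signal 9
""".splitlines()

for start, end, seconds in hobbitd_lifetimes(sample):
    print(start.time(), "->", end.time(), seconds, "seconds")
```

On this excerpt the two runs last 185 and 243 seconds, so each hobbitd instance survives only three to four minutes before it is killed.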
list Vernon Everett · Fri, 1 Jul 2005 09:57:42 +0800 ·
Hi Henrik 

Been watching this carefully, and seeing hobbitd test go yellow all the
time.
Is there a list of what all the numbers mean, and what they should be in
the hobbitd results?

Mine looked like this, which is all very nice, but what do they mean?

Statistics for Hobbit daemon
Up since 01-Jul-2005 07:31:02 (0 days, 00:05:01)

Incoming messages      :        145
- status               :        122
- combo                :         18
- page                 :          0
- summary              :          0
- data                 :          0
- notes                :          0
- enable               :          0
- disable              :          0
- ack                  :          0
- config               :          0
- query                :          0
- hobbitdboard         :          5
- hobbitdlog           :          0
- drop                 :          0
- rename               :          0
- dummy                :          0
- notify               :          0
- schedule             :          0
- Bogus/Timeouts       :          0
Incoming messages/sec  :          0 (average last 301 seconds)

status channel messages:        120 (1 readers)
stachg channel messages:          0 (1 readers)
page   channel messages:          1 (1 readers)
data   channel messages:          0 (1 readers)
notes  channel messages:          0 (0 readers)
enadis channel messages:          0 (1 readers)


Latest errormessages:
Loading hostnames
Loading saved state
Setting up network listener on 0.0.0.0:1984
Setting up signal handlers
Setting up hobbitd channels
Setting up logfiles
Setup complete
list Henrik Størner · Fri, 1 Jul 2005 07:37:44 +0200 ·
Hi Vernon,

could you try removing the HEARTBEAT line from the first entry in 
hobbitlaunch.cfg ?

It looks like your hobbitd process is being bounced frequently.
I've seen that happen for no apparent reason when the heartbeat
check has been enabled - on some systems.


Regards,
Henrik
list Vernon Everett · Fri, 1 Jul 2005 13:51:47 +0800 ·
Hi Henrik

I removed the HEARTBEAT line, and restarted.
No change. :-(

In case it helps, I am running Mandrake 10.1

Regards
   Vernon 
list Vernon Everett · Fri, 1 Jul 2005 14:30:22 +0800 ·
Found this in /var/log/messages

Jul  1 14:01:00 pengo CROND[4492]: (root) CMD (nice -n 19 run-parts /etc/cron.hourly)
Jul  1 14:01:01 pengo msec: changed mode of /var/log/hobbit/hobbitlaunch.pid from 664 to 640
Jul  1 14:01:01 pengo msec: changed mode of /var/log/hobbit/hobbitd.pid from 664 to 640
Jul  1 14:01:01 pengo msec: changed mode of /var/log/wtmp from 664 to 640
Jul  1 14:01:01 pengo msec: changed group of /var/log/wtmp from utmp to adm
Jul  1 14:01:01 pengo msec: changed mode of /var/log/Xorg.0.log from 644 to 640
Jul  1 14:01:01 pengo msec: changed group of /var/log/Xorg.0.log from root to adm
Jul  1 14:06:48 pengo su(pam_unix)[4751]: session opened for user hobbit by root(uid=0)
Jul  1 14:06:48 pengo su[4751]: pam_xauth: error creating temporary file `/usr/lib/hobbit/.xauthyyUFNl': Permission denied
Jul  1 14:06:58 pengo su(pam_unix)[4751]: session closed for user hobbit
Jul  1 14:23:13 pengo crontab[5400]: (root) LIST (root)

Any help?
list Henrik Størner · Fri, 1 Jul 2005 08:40:22 +0200 ·
On Fri, Jul 01, 2005 at 01:51:47PM +0800, Vernon Everett wrote:
> I removed the HEARTBEAT line, and restarted.
> No change. :-(
>
> In case it helps, I am running Mandrake 10.1

OK - something similar did happen on my own system a few days
ago, but it was so bizarre I wonder if it could happen on two
boxes in the same week :-) The Linux kernel was leaking memory, so
eventually it ran out of network buffer space and Hobbit couldn't
send responses anywhere.

Could you try running "dmesg" and see if there are any "failed
allocation" messages at the bottom ? This should really only list
the messages you see during boot-up and any hardware detection
that has happened.

Also, send me a "vmstat 4 20" output from the box.

If you run 

   ~hobbit/server/bin/bb 127.0.0.1 "hobbitdboard"

does that hang ? What if you do

   ~hobbit/server/bin/bb 127.0.0.1 "hobbitdboard host=YOUR.HOBBIT.HOSTNAME"
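
For readers without the bb binary at hand, roughly the same probe can be sketched in Python. This is not the bb client itself and rests on an assumption: that hobbitd accepts a plain-text message on TCP port 1984 (the listener port shown in the logs) and treats the sender closing its write side as the end of the request. The function name query_hobbitd and its parameters are hypothetical.

```python
import socket


def query_hobbitd(message, host="127.0.0.1", port=1984, timeout=10.0):
    """Send one message and return the reply bytes, or None on timeout or error."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(message.encode("ascii"))
            # Assumption: half-closing the connection marks the end of the request.
            sock.shutdown(socket.SHUT_WR)
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # server closed the connection, reply is complete
                chunks.append(data)
            return b"".join(chunks)
    except OSError:
        # Covers refused connections as well as socket timeouts.
        return None


if __name__ == "__main__":
    reply = query_hobbitd("hobbitdboard", timeout=5.0)
    if reply is None:
        print("no reply from hobbitd (timeout or connection error)")
    else:
        print(reply.decode("ascii", errors="replace"))
```

A None result after the full timeout would point at a daemon that accepts the connection but never answers, which is a different failure from a connection that is refused immediately.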


Regards,
Henrik
list Vernon Everett · Fri, 1 Jul 2005 15:25:30 +0800 ·
Hi Henrik

Thanks for helping on this.
I rebooted this morning. Could the memory leak still affect me in that
short time?

No "failed allocation" in dmesg output.
Do you want the full output?

[root@pengo log]# vmstat 4 20
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0      0  67916  14428  92136    0    0    19     5 1025   161  1  1 98  1
 0  0      0  67852  14428  92136    0    0     0     0 1024   150  0  1 99  0
 0  0      0  67852  14436  92136    0    0     0     5 1031   157  0  1 99  0
 0  0      0  67852  14444  92136    0    0     0    12 1028   148  0  0 100  0
 0  0      0  67852  14444  92136    0    0     0     2 1025   152  0  0 99  0
 0  0      0  67852  14448  92136    0    0     0     1 1024   154  0  1 99  0
 0  0      0  67852  14448  92136    0    0     0     0 1026   145  0  1 100  0
 0  0      0  67796  14448  92136    0    0     0     0 1028   157  1  1 99  0
 0  0      0  67796  14448  92136    0    0     0     0 1023   149  0  0 100  0
 0  0      0  67796  14456  92136    0    0     0     3 1024   155  1  1 99  0
 0  0      0  67796  14456  92140    0    0     0     0 1037   157  0  1 99  0
 0  0      0  67796  14468  92140    0    0     0    17 1026   150  0  1 99  0
 0  0      0  67796  14468  92140    0    0     0     0 1022   157  0  1 99  0
 0  0      0  67796  14476  92140    0    0     0     4 1022   148  0  0 100  0
 0  0      0  67796  14476  92140    0    0     0     0 1023   157  1  1 99  0
 0  0      0  67796  14476  92140    0    0     0     0 1021   152  0  1 100  0
 0  0      0  67796  14476  92140    0    0     0     0 1019   147  1  0 99  0
 0  0      0  67796  14476  92140    0    0     0     6 1026   153  0  1 99  0
 0  0      0  67796  14492  92140    0    0     2    12 1023   151  0  0 92  8
 0  0      0  67796  14492  92140    0    0     0     0 1024   155  0  1 99  0

All these commands returned to command prompt with the following error
message.

As user hobbit
[hobbit@pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard"
2005-07-01 15:21:45 Whoops ! bb failed to send message - timeout
[hobbit@pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard host=pengo"
2005-07-01 15:21:00 Whoops ! bb failed to send message - timeout
[hobbit@pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard host=pengo.afgonlin.com.au"
2005-07-01 15:21:30 Whoops ! bb failed to send message - timeout

As root
[root@pengo log]# /usr/lib/hobbit/server/bin/bb 127.0.0.1 "hobbitdboard"
2005-07-01 15:17:41 Whoops ! bb failed to send message - timeout
[root@pengo log]# /usr/lib/hobbit/server/bin/bb 127.0.0.1 "hobbitdboard host=pengo"
2005-07-01 15:18:35 Whoops ! bb failed to send message - timeout
[root@pengo log]# /usr/lib/hobbit/server/bin/bb 127.0.0.1 "hobbitdboard host=pengo.afgonlin.com.au"
2005-07-01 15:18:48 Whoops ! bb failed to send message - timeout
list Henrik Størner · Fri, 1 Jul 2005 09:37:45 +0200 ·
On Fri, Jul 01, 2005 at 03:25:30PM +0800, Vernon Everett wrote:
> Thanks for helping on this.
> I rebooted this morning. Could the memory leak still affect me in that
> short time?

Probably not. Just wanted to rule out this possibility.

> No "failed allocation" in dmesg output.
> Do you want the full output?

No, I don't think that is necessary.

> [root@pengo log]# vmstat 4 20

And your system is mostly idle with no swap or disk activity.

> [hobbit@pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard"
> 2005-07-01 15:21:45 Whoops ! bb failed to send message - timeout

Could you try running "strace -p <process-ID of the hobbitd process>"
for a minute or two and send me the output, then do a 
"kill -6 <process-id>" and mail me the core-file from
~hobbit/server/tmp/ together with the ~hobbit/server/bin/hobbitd file ?

Also, after this try adding a "--debug" to the hobbitd commandline
in hobbitlaunch.cfg. Let it run for a while and then mail me the
hobbitd.log file.

This bug sounds a bit nasty, I think ....


Regards,
Henrik
list Vernon Everett · Fri, 1 Jul 2005 16:56:38 +0800 ·
Hi Henrik

It should be idle. All the system does is run hobbit. :-)

Hobbitd is currently dead in the water.
	[root@pengo log]# strace -p 3025
	Process 3025 attached - interrupt to quit
	futex(0x40141b20, FUTEX_WAIT, 2, NULL

And it's been like this a while.
When I did the kill -6 I got this.
	[root@pengo log]# strace -p 3025
	Process 3025 attached - interrupt to quit
	futex(0x40141b20, FUTEX_WAIT, 2, NULL)  = -1 EINTR (Interrupted system call)
	--- SIGABRT (Aborted) @ 0 (0) ---
	Process 3025 detached
Which I suppose was expected :-)

I restarted it, and got this.
	[root@pengo etc]# strace -p 9223
	Process 9223 attached - interrupt to quit
	semop(32769, 0xbfffe3a0, 1
Nope, there is nothing I forgot to cut and paste.
That really was it.

And this shit just gets stranger and stranger.
It isn't dumping core.
I hit it with a kill -6 and nothing happens.
I then thought maybe we were both mistaken and had the command wrong, or
my Linux defaulted to not dumping core, so I started vi in a session and did
a kill -6 on that. That dumped?!
Hobbit isn't dumping.

I rebooted and tried again.
I managed to get a nice strace output - see attached - but still no damn
core.

OK, I added debug, and restarted.
When I went to check the logs, I found this in hobbitlaunch.log.
---snip---
2005-07-01 16:37:21 Loading tasklist configuration from /usr/lib/hobbit/server/etc/hobbitlaunch.cfg
2005-07-01 16:37:21 Loading hostnames
2005-07-01 16:37:21 Loading saved state
2005-07-01 16:37:21 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:21 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:21 Task hobbitd started with PID 4761
2005-07-01 16:37:26 Task hobbitd terminated, status 1
2005-07-01 16:37:26 Loading hostnames
2005-07-01 16:37:26 Loading saved state
2005-07-01 16:37:26 Task hobbitd started with PID 4765
2005-07-01 16:37:26 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:26 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:26 Task hobbitd terminated, status 1
2005-07-01 16:37:31 Loading hostnames
2005-07-01 16:37:31 Loading saved state
2005-07-01 16:37:31 Task hobbitd started with PID 4770
2005-07-01 16:37:31 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:31 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:31 Task hobbitd terminated, status 1
2005-07-01 16:37:36 Task hobbitd started with PID 4774
2005-07-01 16:37:36 Loading hostnames
2005-07-01 16:37:36 Loading saved state
2005-07-01 16:37:36 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:36 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:36 Task hobbitd terminated, status 1
2005-07-01 16:37:41 Task hobbitd started with PID 4778
2005-07-01 16:37:41 Loading hostnames
2005-07-01 16:37:41 Loading saved state
2005-07-01 16:37:41 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:41 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:41 Task hobbitd terminated, status 1
2005-07-01 16:37:46 Task hobbitd started with PID 4783
2005-07-01 16:37:46 Loading hostnames
2005-07-01 16:37:46 Loading saved state
2005-07-01 16:37:46 Setting up network listener on 0.0.0.0:1984
2005-07-01 16:37:46 Cannot bind to listen socket (Address already in use)
2005-07-01 16:37:46 Task hobbitd terminated, status 1
---snip---

Looks like a clue.
I will add the output of netstat -a
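
"Address already in use" on bind means some socket already holds 0.0.0.0:1984, which would fit an older hobbitd that never exited while hobbitlaunch keeps starting new ones. A rough Python sketch of a probe for that condition follows; the function name `port_in_use` is illustrative, and 1984 is simply the port shown in the log above. It does not say which process holds the port; netstat is still needed for that.

```python
import errno
import socket


def port_in_use(host: str, port: int) -> bool:
    """Return True if binding a TCP socket to host:port fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # e.g. permission problems are a different failure
        return False


if __name__ == "__main__":
    # Port taken from the hobbitlaunch.log excerpt; host is an assumption.
    print("1984 in use:", port_in_use("0.0.0.0", 1984))
```

The probe deliberately does not set SO_REUSEADDR, so it may also report a port as busy while old connections are still lingering in TIME_WAIT.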

Got the hobbitd.log file for you too.

Let me know if there is anything else I can get you.

Regards
    Vernon

P.S. Your cold one is quickly becoming many cold ones if you ever get to
Perth


-----Original Message-----
From: Henrik Stoerner [mailto:user-ce4a2c883f75@xymon.invalid] 
Sent: Friday, 1 July 2005 3:38 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] Status Unavailable

On Fri, Jul 01, 2005 at 03:25:30PM +0800, Vernon Everett wrote:
> Thanks for helping on this.
> I rebooted this morning. Could the memory leak still affect me in that
> short time?

Probably not. Just wanted to rule out this possibility.

> No "failed allocation" in dmesg output.
> Do you want the full output?

No, I don't think that is necessary.

> [root at pengo log]# vmstat 4 20

And your system is mostly idle with no swap or disk activity.

> [hobbit at pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard"
> 2005-07-01 15:21:45 Whoops ! bb failed to send message - timeout

Could you try running "strace -p <process-ID of the hobbitd process>"
for a minute or two and send me the output, then do a "kill -6
<process-id>" and mail me the core-file from ~hobbit/server/tmp/
together with the ~hobbit/server/bin/hobbitd file ?

Also, after this try adding a "--debug" to the hobbitd commandline in
hobbitlaunch.cfg. Let it run for a while and then mail me the
hobbitd.log file.

This bug sounds a bit nasty, I think ....


Regards,
Henrik


list Stefan Loos · Mon, 04 Jul 2005 06:16:21 +0000 ·
Hello Vernon,

Can you tell me if there is anything like "hobbitd status board not available" in bb-display.log?

Regards,

Stefan

list Vernon Everett · Mon, 4 Jul 2005 14:23:56 +0800 ·
Yes.
Quite often.
---snip---
2005-07-04 14:09:17 Whoops ! bb failed to send message - timeout
2005-07-04 14:09:17 Could not get the Hobbit statuslog-list
2005-07-04 14:09:50 Whoops ! bb failed to send message - timeout
2005-07-04 14:09:50 hobbitd status-board not available
2005-07-04 14:10:49 Whoops ! bb failed to send message - timeout
2005-07-04 14:10:49 hobbitd status-board not available
2005-07-04 14:11:49 Whoops ! bb failed to send message - timeout
2005-07-04 14:11:49 hobbitd status-board not available
2005-07-04 14:12:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:12:52 hobbitd status-board not available
2005-07-04 14:13:50 Whoops ! bb failed to send message - timeout
2005-07-04 14:13:50 hobbitd status-board not available
2005-07-04 14:14:50 Whoops ! bb failed to send message - timeout
2005-07-04 14:14:50 hobbitd status-board not available
2005-07-04 14:16:22 Whoops ! bb failed to send message - timeout
2005-07-04 14:16:22 hobbitd status-board not available
2005-07-04 14:16:22 WARNING: Runtime 61 longer than BBSLEEP (60)
2005-07-04 14:16:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:16:52 hobbitd status-board not available
2005-07-04 14:17:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:17:52 hobbitd status-board not available
2005-07-04 14:18:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:18:52 hobbitd status-board not available
2005-07-04 14:19:52 Whoops ! bb failed to send message - timeout
2005-07-04 14:19:52 hobbitd status-board not available
2005-07-04 14:21:26 Whoops ! bb failed to send message - timeout
2005-07-04 14:21:26 hobbitd status-board not available
2005-07-04 14:21:26 WARNING: Runtime 61 longer than BBSLEEP (60)
2005-07-04 14:21:59 Whoops ! bb failed to send message - timeout
2005-07-04 14:21:59 hobbitd status-board not available
---snip---


-----Original Message-----
From: Stefan Loos [mailto:user-dea24d965402@xymon.invalid] 
Sent: Monday, 4 July 2005 2:16 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: RE: [hobbit] Status Unavailable

Hello Vernon,

can you tell me, if there is anything like "hobbitd status board not
available" in the bb-display.log?

Regards,

Stefan



list Stefan Loos · Mon, 04 Jul 2005 06:42:37 +0000 ·
And can you stop the hobbit server with hobbit.sh or is one process still running after that?

list Vernon Everett · Mon, 4 Jul 2005 14:48:12 +0800 ·
Never actually checked that.
Nope, hobbitd refuses to die.

You got any ideas?
quoted from Stefan Loos

 
-----Original Message-----
From: Stefan Loos [mailto:user-dea24d965402@xymon.invalid]
Sent: Monday, 4 July 2005 2:43 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: RE: [hobbit] Status Unavailable

And can you stop the hobbit server with hobbit.sh or is one process
still running after that?

list Stefan Loos · Mon, 04 Jul 2005 07:04:02 +0000 ·
Hello Vernon,

I think you see the same problem that I reported to Henrik in May (bbdisplay problems after adding some hosts).
The only difference is that my hobbit server runs normally as long as no other hosts are added to bb-hosts.
I wanted to find out if it's an OS problem (we use RedHat Enterprise 3) and installed hobbit on a SuSE system. It shows the same problems....

Regards,

Stefan

list Vernon Everett · Mon, 4 Jul 2005 15:39:20 +0800 ·
I think mine started after I removed a test, but I am not sure.
It might have happened before. 
list Vernon Everett · Mon, 4 Jul 2005 16:00:41 +0800 ·
Hi Stefan,

Did you ever get your Hobbit up and running again?
list Henrik Størner · Mon, 4 Jul 2005 10:10:21 +0200 ·
Hi Vernon,

could you try disabling the [larrdcolumn] and [infocolumn] tasks in your
hobbitlaunch.cfg? These are no longer used, and I suspect that the code
in hobbitd that handles these messages is somewhat flaky.

If hobbitd refuses to stop, you'll have to do it with a "kill -9". When
you do that, please check that there are no resources left allocated
by hobbitd:

  login as the hobbit user
  run "ipcs -m" and "ipcs -s"

If you see something like

  hobbit@osiris:~$ ipcs -m

  ------ Shared Memory Segments --------
  key        shmid      owner      perms      bytes      nattch     status
  0x01020b1f 482148354  hobbit     600        102400     2
  0x02020b1f 482181123  hobbit     600        102400     2
  0x03020b1f 482213892  hobbit     600        102400     2
  0x04020b1f 482246661  hobbit     600        102400     2
  0x05020b1f 482279430  hobbit     600        102400     1
  0x06020b1f 482312199  hobbit     600        102400     1

  hobbit@osiris:~$ ipcs -s

  ------ Semaphore Arrays --------
  key        semid      owner      perms      nsems
  0x01020b1f 3276801    hobbit     600        3
  0x02020b1f 3309570    hobbit     600        3
  0x03020b1f 3342339    hobbit     600        3
  0x04020b1f 3375108    hobbit     600        3
  0x05020b1f 3407877    hobbit     600        3
  0x06020b1f 3440646    hobbit     600        3

then it hasn't cleaned up properly and you should either reboot the box
or use "ipcrm -m <shmid>" and "ipcrm -s <semid>" to delete these.
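
The check and clean-up described above could also be scripted. What follows is only a sketch and not part of Hobbit: it assumes the Linux "ipcs" table layout shown above (key, id and owner in the first three columns), and leftover_ipc_commands is a hypothetical helper name. It only builds the "ipcrm" commands so they can be reviewed before anything is deleted.

```python
def leftover_ipc_commands(ipcs_output, owner="hobbit"):
    """Turn `ipcs -m` / `ipcs -s` output into matching `ipcrm` commands."""
    commands = []
    flag = None  # "-m" while in the shared memory table, "-s" in the semaphore table
    for line in ipcs_output.splitlines():
        fields = line.split()
        if "Shared Memory" in line:
            flag = "-m"
        elif "Semaphore" in line:
            flag = "-s"
        elif (flag and len(fields) >= 3
              and fields[0].startswith("0x") and fields[2] == owner):
            # fields[1] is the shmid or semid column
            commands.append(f"ipcrm {flag} {fields[1]}")
    return commands


# Shortened sample in the same layout as the listing above.
sample = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x01020b1f 482148354  hobbit     600        102400     2

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x01020b1f 3276801    hobbit     600        3
"""

for command in leftover_ipc_commands(sample):
    print(command)
```

In practice the text would come from running "ipcs -m" and "ipcs -s" as the hobbit user after hobbitd has been killed; printing the commands first keeps the deletion a deliberate manual step.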


Regards,
Henrik
list Vernon Everett · Mon, 4 Jul 2005 16:30:48 +0800 ·
Nope!
Hashed out the [larrdcolumn] and [infocolumn] sections.
Had a look. Saw your shared mem segments.
Didn't even look for semaphores.
Decided a reboot is probably easier.
Hobbit started on its own. 
It looked promising, and then after about 2 minutes, back to its usual
crap.

Green background. Header & footer only :-(
list Stefan Loos · Mon, 04 Jul 2005 08:32:35 +0000 ·
Hi Vernon,

If I stop messages from being sent to the hobbit server (by blocking port 1984), then everything seems to run. But that makes no sense for a monitoring server!


list Stefan Loos · Mon, 04 Jul 2005 08:35:39 +0000 ·
You're running the hobbit server without the HEARTBEAT option - right?

list Vernon Everett · Mon, 4 Jul 2005 16:41:03 +0800 ·
Yes.
That's the first thing Henrik asked me to disable.

list Henrik Størner · Mon, 4 Jul 2005 15:36:37 +0200 ·
On Mon, Jul 04, 2005 at 07:04:02AM +0000, Stefan Loos wrote:
> I think you see the same problem that I reported to Henrik in May (bbdisplay problems after adding some hosts).
> The only difference is that my hobbit server runs normally as long as no other hosts are added to bb-hosts.
> I wanted to find out if it's an OS problem (we use RedHat Enterprise 3) and installed hobbit on a SuSE system. It shows the same problems....

Well, I've found *one* bug that might explain this - but I do not yet
know if it's the one that bites Vernon.

It's triggered by an invalid status message that doesn't include any
column-name. 
Stefan - you might want to try out the current snapshot available
at http://www.hswn.dk/beta/hobbit-snapshot.tar.gz

You build it the normal way (configure; make) - but instead of
running "make install" just copy the hobbitd/hobbitd binary to
your ~hobbit/server/bin/

After restarting Hobbit you should see some "Bogus status message"
entries in the hobbitd.log file if this problem occurs, but it should no longer crash.
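
To illustrate the kind of message involved, here is a rough sketch of such a check. It is not the hobbitd code. It assumes the usual BB-style status line "status HOST.COLUMN COLOR text", with the dots in the host name sent as commas, and split_status_target is a made-up name.

```python
def split_status_target(message):
    """Return (host, column) for a status message, or None if it is bogus."""
    first_line = message.split("\n", 1)[0]
    tokens = first_line.split()
    # Expect "status" (possibly with a suffix such as "+30") and a target.
    if len(tokens) < 2 or not tokens[0].startswith("status"):
        return None
    host, dot, column = tokens[1].rpartition(".")
    if not dot or not host or not column:
        return None  # no column name after the host: treat as bogus
    # Assumption: commas in the host part stand for dots.
    return host.replace(",", "."), column


for msg in ("status www,example,com.http green all fine",
            "status www,example,com green no column given"):
    target = split_status_target(msg)
    if target is None:
        print("Bogus status message:", msg)
    else:
        print("host=%s column=%s" % target)
```

Rejecting and logging the message, rather than trying to process it, matches the behaviour described above for the snapshot.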


Regards,
Henrik
list Stefan Loos · Mon, 04 Jul 2005 14:16:34 +0000 ·
Hello Henrik,

I installed the snapshot as you described - no errors so far.

Regards,

Stefan

list Stefan Loos · Tue, 05 Jul 2005 12:22:58 +0000 ·
Hello Henrik,

The hobbit server is still running without problems!
And there are no "status board not available" entries in the bb-display.log. The hobbitd.log shows only the "Setup complete" message since the last restart.
I will wait until tomorrow and then reinstall RedHat on that server (I only have one for testing).
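
Checking the logs for these messages can be scripted roughly as follows. This is a sketch only: count_markers is a hypothetical helper, the sample lines are invented for illustration, and the real log line layout may differ, which is why it only does plain substring matching.

```python
MARKERS = (
    "Bogus status message",
    "status board not available",
    "Setup complete",
)


def count_markers(log_lines, markers=MARKERS):
    """Count how many lines contain each marker string."""
    counts = {marker: 0 for marker in markers}
    for line in log_lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


# Invented sample lines; a real run would iterate over an open log file.
sample_log = [
    "2005-07-05 12:00:00 Setup complete",
    "2005-07-05 12:05:10 Bogus status message: no column name",
]

for marker, hits in count_markers(sample_log).items():
    print(f"{hits:3d}  {marker}")
```

Any iterable of lines works, so an open file object for hobbitd.log or bb-display.log can be passed in directly.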

Regards,

Stefan

list Henrik Størner · Tue, 5 Jul 2005 14:33:04 +0200 ·
On Tue, Jul 05, 2005 at 12:22:58PM +0000, Stefan Loos wrote:
> the hobbit-server is still running without problems!
> And there are no "status board not available" in the bb-display.log. The
> hobbitd.log shows only the "Setup complete" message since the last restart.

And have you added the problematic hosts that used to trigger this
problem?


Regards,
Henrik
list Stefan Loos · Tue, 05 Jul 2005 14:04:10 +0000 ·
Let me put it another way: I didn't drop any host from bb-hosts. So if
there was a problematic host, it's still there.

Regards,

Stefan
list Stefan Loos · Mon, 11 Jul 2005 07:49:08 +0000 ·
Hello Henrik,

After the server had been running for a few days, the error came back (even with the
hobbitd from the snapshot version you gave me).

It would be interesting to hear from Vernon if his server still has this 
problem....

Regards,

Stefan
list Vernon Everett · Mon, 11 Jul 2005 15:56:49 +0800 ·
I am still here, and still having major problems :-(
I thought we had isolated the problem to messages from a particular
monitored client, but after a few hours of smooth running, with messages
from that client disabled, it failed again.

Regards
   Vernon
list Stefan Loos · Mon, 11 Jul 2005 08:19:50 +0000 ·
Did you use the hobbitd from the snapshot?

list Stefan Loos · Mon, 11 Jul 2005 08:21:14 +0000 ·
Did you use the hobbitd from the snapshot?

list Vernon Everett · Mon, 11 Jul 2005 16:37:28 +0800 ·
Yep.
I did.
Between Henrik and me, we have tried a number of versions, and a few
special diagnostic versions of Hobbit.
We have been working mostly off-list because we have been exchanging
potentially confidential information, and don't believe that our failure
to diagnose a problem is of general interest. (I am sure a full account
of this sad tale will be posted once it is resolved.)

Kudos to Henrik though.
I think he has really tried. He has worked tirelessly to try and resolve
this issue, which I believe is very admirable, when you consider his
reward for all this hard work. 
Henrik is a true mensch. (Yes, that's an English word. Look it up)

So far, we have not been able to identify the root cause of the problem.
Henrik was going to have a look at some of the messages that came from
one of the hosts, and get back to me.

Stefan, if you are interested in assisting us with this, then Henrik and
I can cc you in our off-list exchanges.

Have you got any theories as to the cause?

Regards
    Vernon


list Stefan Loos · Mon, 11 Jul 2005 09:36:42 +0000 ·
It would be great if you could put me in cc. If you want, I can try to assist you. I'm at a point where I don't know what to try anymore. I think it isn't easy for Henrik to find this issue - I have no core dumps and nothing in the logfile that could help.
And I never had any doubt that Henrik does a great job! (I think my English is not good enough to say it in other words.)
So if there is anything I can do to solve this problem....

I've tried to look up "mensch" but I think I'm using the wrong sites - German-English dictionaries always recognize mensch as a German word ;-)

Regards,
Stefan

list Vernon Everett · Mon, 11 Jul 2005 18:01:45 +0800 ·
Try http://www.onelook.com/?w=mensch&ls=a
:-)

Once Hobbit hangs, it refuses to core.
The only thing that will clobber it is a kill -9.
When it hangs, it hangs good.
The logs are devoid of useful info too. Even with the latest build.

This is probably the most frustrating issue I have ever seen.

Regards
    Vernon
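
A hang like the one described above (the daemon stays up but stops answering) can at least be detected from outside. The following is a minimal sketch, not part of Hobbit: it connects to a TCP port, sends a short probe, and reports whether any reply arrives within a timeout. The function name `daemon_responds`, the probe text, and the port used in the usage note are assumptions for illustration; whether a given Hobbit version answers this probe would need to be checked.

```python
import socket


def daemon_responds(host, port, probe=b"ping", timeout=5.0):
    """Return True if the service at host:port sends any data back within `timeout` seconds.

    `probe` is a hypothetical request body; adjust it to whatever the
    monitored daemon is known to answer.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(probe)
            # Half-close the connection so the server sees the end of the request.
            sock.shutdown(socket.SHUT_WR)
            return len(sock.recv(4096)) > 0
    except OSError:
        # Covers refused connections, resets, and timeouts alike.
        return False
```

Run periodically (for example from cron as `daemon_responds("127.0.0.1", 1984)`, assuming the daemon listens on port 1984), this would record when a hang starts, which may help correlate it with incoming status messages.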

list Stefan Loos · Mon, 11 Jul 2005 10:47:46 +0000 ·
Funny thing - your link brought me to http://germanenglishwords.com. Do you really use these words in English? I always thought that kindergarten and sauerkraut were the only words used in English-speaking countries....

Yes, it's really frustrating!
So many others on this mailing list don't have that problem.
Are you using subpages in bb-hosts? Do you have many self-written monitoring scripts?

Regards,

Stefan


list Henrik Størner · Mon, 11 Jul 2005 13:08:59 +0200 ·
Hi Stefan,

On Mon, Jul 11, 2005 at 09:36:42AM +0000, Stefan Loos wrote:
> It would be great if you can put me in cc. If you want I can try to assist
> you. I'm at a point where I don't know what to try anymore. I think it
> isn't easy for Henrik to find this issue - I have no coredumps and nothing
> in the logfile what could help.

Yes, this is a really nasty problem. Vernon and I thought we had it
nailed down by the end of last week, but there's more to it than
what we found then.

What kind of external scripts run on your clients, apart from the BB
client? The current suspicion is that this is triggered by a status
message that is handled badly by Hobbit, causing this lock-up. So I'm
trying to see if there might be something in common between your setups.

And what kind of system are you running Hobbit on? If Linux, which
distribution? Another suspicion I have is that this might be a
problem with the implementation of SysV IPC semaphores.


Regards,
Henrik
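
To illustrate the general failure mode behind the semaphore suspicion: a process that waits on a semaphore with no timeout blocks silently if the other side never releases it. The sketch below is only an illustration under stated assumptions. It uses Python's `threading.Semaphore` as a stand-in, not SysV IPC, and it is not Hobbit's code; the helper name `update_with_timeout` is hypothetical.

```python
import threading


def update_with_timeout(sem, work, timeout=2.0):
    """Run `work()` while holding `sem`; give up instead of blocking forever.

    Returns False if the semaphore could not be acquired within `timeout`
    seconds, which a caller could log as a sign of a stuck peer.
    """
    if not sem.acquire(timeout=timeout):
        return False  # the semaphore was never released by its holder
    try:
        work()
        return True
    finally:
        sem.release()
```

With a bounded wait, the caller gets a chance to log the stall; an unbounded wait would simply hang with nothing written to the logs.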
list Stefan Loos · Mon, 11 Jul 2005 12:11:46 +0000 ·
Hi Henrik,

We have several self-written scripts (mostly in Perl) that monitor Oracle instances and BEA WebLogic servers. There is one for hardware monitoring on HP (Intel) and Sun servers (prtdiag based), and some are just the output of an HTTP request to the software running on those WebLogic servers.
I have just one server for testing; it's an HP DL 360. We are running Red Hat Enterprise Server 3, but I've tried it with SuSE 9.3 too.

Regards,

Stefan

list Reif Jeffery M · Mon, 11 Jul 2005 07:20:12 -0500 ·
This is a shot in the dark - I had a lockup problem on a BB system where
the kernel was compiled with a different compiler version than the one
included with the system, and there were some IPC-related changes between
compiler versions.
list Vernon Everett · Mon, 11 Jul 2005 20:56:16 +0800 ·
English has no shame.
It will borrow, beg, or steal words from any language :-)
Mensch was taken from Yiddish, a hybrid of old German and Hebrew.

From Onelook.com
Quick definitions (mensch)

noun:   a decent responsible person with admirable characteristics

From MSN Encarta
mensch (plural mensch*en or mensch*es) or mensh (plural mensh*en or mensh*es)
noun good person: somebody good, kind, decent, and honorable (informal)

[Mid-20th century. Via Yiddish < Old High German mennisco "person, human"]
list Stefan Loos · Mon, 11 Jul 2005 13:23:32 +0000 ·
Hello Jeffery,

Do you think we see this issue because the kernel was built with a different compiler than the one I used to build the Hobbit server?

@Henrik - do you think this could be the problem?

Regards,
Stefan


list Reif Jeffery M · Mon, 11 Jul 2005 08:34:34 -0500 ·
Stefan,

I don't know if this is your problem; I just suggested it as a
possibility for you to consider.  I originally saw random lockups on a
BB system after upgrading from Red Hat 7 to 8 or 9 (I don't remember
which), and I thought the same might be possible on other versions or
distributions as well.
The solution in my case was one of:
1) Downgrade the compiler to match the OS.
2) Recompile the kernel.
3) Change OS versions (sounds like you may have tried this).

I hope this helps in some way.  Good luck on your problem.

Jeff
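
A rough way to check for the mismatch described above is to compare the compiler recorded in /proc/version with the compiler currently installed. The sketch below assumes a Linux system whose /proc/version contains a "gcc version X.Y.Z" string (the format used by kernels of this era; newer kernels word it differently, in which case the helper returns None). The function names are made up for illustration and are not part of BB or Hobbit.

```python
import re
import subprocess
from typing import Optional


def kernel_compiler_version(proc_version: str) -> Optional[str]:
    """Return the GCC version named in a /proc/version string, or None."""
    # Assumption: the string contains "gcc version X.Y.Z".
    match = re.search(r"gcc version (\d+(?:\.\d+)*)", proc_version)
    return match.group(1) if match else None


def installed_compiler_version() -> Optional[str]:
    """Ask the installed gcc for its version; None if gcc cannot be run."""
    try:
        result = subprocess.run(
            ["gcc", "-dumpversion"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def versions_agree(a: str, b: str) -> bool:
    """Compare dotted versions over the components both strings provide."""
    return all(x == y for x, y in zip(a.split("."), b.split(".")))


def main() -> None:
    try:
        with open("/proc/version") as fh:
            kernel = kernel_compiler_version(fh.read())
    except OSError:
        kernel = None
    installed = installed_compiler_version()
    print("kernel built with gcc:", kernel)
    print("installed gcc:", installed)
    if kernel and installed:
        print("match" if versions_agree(kernel, installed) else "MISMATCH")


if __name__ == "__main__":
    main()
```

A mismatch reported here only shows that the two compilers differ. It does not by itself show that the difference causes the lockups.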
list Henrik Størner · Mon, 11 Jul 2005 14:41:38 +0000 (UTC) ·
In <user-2d5f2349ef80@xymon.invalid> "Stefan Loos" <user-dea24d965402@xymon.invalid> writes:

>do you think we see this issue because the kernel was built with a different compiler than I was using for building the hobbit server?
>@Henrik - do you think this could be the problem?

No, I don't think so. Compiler versions should not matter; in the end it's just binary code.

One thing I do have in mind as a potential source of this problem is the
fact that newer Linux distributions tend to stuff all sorts of new
"scalability" features into their kernels and libc libraries. This could
mean that they come with versions of the kernel and/or libraries that have
bugs which Hobbit happens to trigger - some of the features that Hobbit
uses are not terribly common for applications, so there could be bugs that just haven't been discovered yet.

But for now, let's assume that the bug is in Hobbit (until we can prove otherwise).


Henrik