Xymon Mailing List Archive

still crashing

9 messages in this thread

list Rob Munsch · Thu, 08 Feb 2007 16:00:47 -0500 ·
I still have a constantly red-then-purple hobbitd_client on my hobbit server.

It's gotten to the point where I have a cron job dropping the test continuously.  I would appreciate any insight into why this started happening and what is causing it.

Core was generated by `hobbitd_client'.
Program terminated with signal 6, Aborted.
#0  0xffffe410 in __kernel_vsyscall ()

Full output follows.

TIA.

hobbit at randomaccess ~/server $ gdb bin/hobbitd_client tmp/core
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
Really redefine built-in command "frame"? (y or n) [answered Y; input not from terminal]
Really redefine built-in command "thread"? (y or n) [answered Y; input not from terminal]
Really redefine built-in command "start"? (y or n) [answered Y; input not from terminal]
Using host libthread_db library "/lib/libthread_db.so.1".

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/libpcre.so.0...done.
Loaded symbols for /usr/lib/libpcre.so.0
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `hobbitd_client'.
Program terminated with signal 6, Aborted.
#0  0xffffe410 in __kernel_vsyscall ()
gdb>
list Henrik Størner · Fri, 9 Feb 2007 12:51:58 +0100 ·
Unfortunately this doesn't give a clue about what actually happened,
except that it jumped to some wild address and crashed.

Could you add this line to hobbitd/hobbitd_client.c 
   dbgprintf("Client report from host %s\n", (hostname ? hostname : "<unknown>"));
around line 1754, just after the
    enum ostype_t os;
    namelist_t *hinfo = NULL;
lines. Then run "make" to rebuild hobbitd_client, copy the
hobbitd/hobbitd_client binary into ~hobbit/server/bin/, and edit
hobbitlaunch.cfg to include a "--debug" on the hobbitd_client command
(AFTER "hobbitd_client", i.e. at the end of the line).

hobbitd_client should restart automatically, and will be logging quite a
bit of data to the clientdata.log file, including the hosts that send it
data. This should let you figure out which host is sending the data that
triggers the crash, by comparing the time of the crash with the
timestamps in the logfile, or at least narrow it down.
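
The comparison of crash times with logged hosts can be sketched in a few lines of Python. This is only an illustration under assumptions: it assumes every relevant log line carries a "YYYY-MM-DD HH:MM:SS" timestamp, that the debug line contains the text "Client report from host <name>" from the dbgprintf above, and that crashes show up as "Worker process died" lines. The function name `suspects` and the 10-second window are hypothetical choices, not part of Hobbit.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumed formats: a timestamp somewhere in the line, the dbgprintf text
# suggested above, and the crash message seen in the server log.
TS_RE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
REPORT_RE = re.compile(r"Client report from host (\S+)")
CRASH_MARK = "Worker process died"


def parse_ts(line):
    """Return the first timestamp found in the line, or None."""
    m = TS_RE.search(line)
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S") if m else None


def suspects(lines, window_seconds=10):
    """Count, per host, how many crashes were preceded by a report from it."""
    reports, crashes = [], []
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        m = REPORT_RE.search(line)
        if m:
            reports.append((ts, m.group(1)))
        elif CRASH_MARK in line:
            crashes.append(ts)

    window = timedelta(seconds=window_seconds)
    counts = Counter()
    for crash in crashes:
        # Hosts that reported within the window just before this crash.
        counts.update({h for ts, h in reports if crash - window <= ts <= crash})
    return counts
```

Feeding it the lines of the debug log and the crash log together (for example via `open(path).readlines()`) and calling `suspects(lines).most_common()` lists the hosts that most often reported just before a crash. A host that appears before nearly every crash is the one whose client message is worth saving.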

Once you know which host it is, it would be interesting to see the
message this host sends. You can grab it from the "client data" link
for this host on the Hobbit web display. I'm obviously interested in
this message (please save it to a file instead of pasting it into an
e-mail), and also in the bb-hosts entry for the host, and any setup in
hobbit-clients.cfg.


Regards,
Henrik
list Rob Munsch · Fri, 09 Feb 2007 14:05:39 -0500 ·
Figures.  Okay, I'm in the middle of a server deployment at the moment, but I will follow through with this early next week and see what I can dig up for you.  Thanks for the response!

list Rob Munsch · Wed, 14 Feb 2007 16:08:11 -0500 ·
I tried doing this.  The make bombed terribly; pages and pages of 
errors.  It started like this:

root at randomaccess ~/hobbit-4.2.0/hobbitd # make
cc  -c -o hobbitd_client.o hobbitd_client.c
hobbitd_client.c:26:22: error: libbbgen.h: No such file or directory
In file included from hobbitd_client.c:28:
client_config.h:23: error: expected ')' before '*' token
client_config.h:27: error: expected ')' before '*' token
client_config.h:33: error: expected ')' before '*' token
client_config.h:38: error: expected ')' before '*' token
client_config.h:40: error: expected ')' before '*' token
client_config.h:43: error: expected ')' before '*' token
client_config.h:47: error: expected ')' before '*' token
client_config.h:51: error: expected ')' before '*' token
client_config.h:55: error: expected ')' before '*' token
hobbitd_client.c:46: error: 'COL_CLEAR' undeclared here (not in a function)
hobbitd_client.c:132: error: expected ')' before '*' token
hobbitd_client.c:165: error: expected declaration specifiers or '...' before 'namelist_t'

I copied the line you gave me from this email, where specified, so I 
don't think it's that.

rob
list Rich Smrcina · Wed, 14 Feb 2007 15:13:19 -0600 ·
Go back a level (cd ..) and run make again from the top-level source directory.  It happens to me a lot! :)

-- 

Rich Smrcina
VM Assist, Inc.
Phone: XXX-XXX-XXXX
Ans Service:  XXX-XXX-XXXX
user-61add9955ef9@xymon.invalid

Catch the WAVV!  http://www.wavv.org
WAVV 2007 - Green Bay, WI - May 18-22, 2007
list Rob Munsch · Wed, 14 Feb 2007 16:53:45 -0500 ·
Marvelously embarrassing.  Thanks, proceeding with requested tests...
sigh
list Rob Munsch · Mon, 26 Feb 2007 14:51:54 -0500 ·
Henrik,

I haven't been able to pinpoint a specific message at the same time the 
hobbitd_client dies.  What I am seeing are blocks of things like this:

2007-02-26 09:56:52 Worker process died with exit code 134, terminating
2007-02-26 10:16:54 Worker process died with exit code 134, terminating
2007-02-26 10:16:55 Worker process died with exit code 134, terminating
2007-02-26 10:26:56 Worker process died with exit code 134, terminating
2007-02-26 10:26:56 Worker process died with exit code 134, terminating
2007-02-26 12:17:07 Worker process died with exit code 134, terminating
2007-02-26 12:17:11 Worker process died with exit code 134, terminating
2007-02-26 12:42:10 Worker process died with exit code 134, terminating
2007-02-26 12:42:14 Worker process died with exit code 134, terminating
2007-02-26 13:02:13 Worker process died with exit code 134, terminating
2007-02-26 13:02:17 Worker process died with exit code 134, terminating
2007-02-26 13:07:13 Worker process died with exit code 134, terminating
2007-02-26 13:07:18 Worker process died with exit code 134, terminating
2007-02-26 13:17:19 Worker process died with exit code 134, terminating
2007-02-26 13:22:20 Worker process died with exit code 134, terminating
2007-02-26 13:22:20 Worker process died with exit code 134, terminating
2007-02-26 13:27:20 Worker process died with exit code 134, terminating
2007-02-26 13:27:20 Worker process died with exit code 134, terminating
2007-02-26 13:32:21 Worker process died with exit code 134, terminating
2007-02-26 13:42:22 Worker process died with exit code 134, terminating
2007-02-26 13:42:22 Worker process died with exit code 134, terminating
2007-02-26 13:52:24 Worker process died with exit code 134, terminating
2007-02-26 13:52:24 Worker process died with exit code 134, terminating
2007-02-26 14:07:26 Worker process died with exit code 134, terminating
2007-02-26 14:07:26 Worker process died with exit code 134, terminating
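
The exit code 134 in these lines is consistent with the gdb output at the top of the thread: 134 = 128 + 6, and signal 6 is SIGABRT on Linux. A minimal sketch of that arithmetic follows, assuming the logged value is either the shell-style 128+N status or a raw wait status with the core-dump bit set (the thread does not say which; both give 134 for an abort that dumps core). `decode_status` is a hypothetical helper, not part of Hobbit.

```python
import signal


def decode_status(code):
    """Return the signal name implied by a 128+N style status, or None."""
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        # No signal with that number on this platform.
        return None


# On Linux, decode_status(134) gives "SIGABRT".
```

SIGABRT is the signal raised by abort(), so the single frame in __kernel_vsyscall shows where the signal was delivered rather than which code asked for the abort. A full backtrace (gdb's `bt` command on the core file) would be needed to see the caller.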

I have it running in --debug mode as per your suggestion, and am getting 
a ton of output; I have a feeling it's a little more than I'm capable of 
sorting through well :(.

The only other oddity is that it occasionally barfs on disk tests.  For no 
apparent reason I get

2007-02-26 09:31:49 Host grape (linux) sent incomprehensible disk report - missing columnheaders 'Capacity' and 'Mounted'

but by the next poll, it's figured it out again.  I don't know if these 
are related, but it's all I've got right now.

I'll keep trying to correlate a specific message with the crash time and 
let you know what I find out.
list Rich Smrcina · Mon, 26 Feb 2007 14:13:27 -0600 ·
Also, if possible try to capture the offending disk report.  Check the 
good report and the bad one to see if the reporting IP addresses are 
different. It is possible that two machines are reporting with the same 
hostname.

I've seen the 'Worker process died' message when I really screwed up 
something in the client coding.  It likely means that something in the 
client message is out of place, which makes sense given the message you 
see about the disk report.
list Rob Munsch · Wed, 28 Feb 2007 14:52:59 -0500 ·
Here's an offending report ("client status"), attached as plain text.  Note that the df section contains top output (huh?!), and Hobbit complains, quite rightly, that it can't make head or tail (so to speak) of the disk space from that.
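
A saved client message can be checked for this kind of mix-up before it reaches the server. The sketch below is an illustration only: it assumes the client message is divided into sections introduced by lines such as "[df]" and "[top]", and that a healthy df section starts with a header containing the 'Capacity' and 'Mounted' columns named in the log message. The helper names and the sample text are hypothetical; the sample is not the attached report.

```python
REQUIRED_DF_COLUMNS = ("Capacity", "Mounted")


def split_sections(message):
    """Map each "[name]" section marker to the lines that follow it."""
    sections = {}
    current = None
    for line in message.splitlines():
        stripped = line.strip()
        if len(stripped) > 2 and stripped.startswith("[") and stripped.endswith("]"):
            current = stripped[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def missing_df_headers(message):
    """Return the required column names absent from the df header line."""
    df_lines = [l for l in split_sections(message).get("df", []) if l.strip()]
    header = df_lines[0].split() if df_lines else []
    return [col for col in REQUIRED_DF_COLUMNS if col not in header]


# Invented sample: a df section that actually holds top output.
bad = "[df]\ntop - 09:31:49 up 1 day,  1 user\nTasks:  80 total\n"
print(missing_df_headers(bad))
```

Running the same check over the good and bad reports from one host shows whether the df section itself is malformed or whether the section contents were swapped, which is what the top output under df suggests here.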

Attachments (1)