Hobbit crashes
list Christian Maxeiner
Hi all,
I have just migrated from BB to Hobbit. I have run Hobbit beside my old BB installation on port 1985 for testing and configuring. Today I have shut down old BB and switched my Hobbit server and clients back to port 1984. Everything worked fine but an hour later some of the hobbit processes are crashing shortly after restarting hobbit. I am running hobbit 4.1.2p1 on HP-UX 11.11.
This is the output of the hobbitlaunch.log file:
...
2006-02-01 16:51:27 hobbitlaunch starting
2006-02-01 16:51:27 Loading tasklist configuration from /users/hobbit4.0//server/etc/hobbitlaunch.cfg
2006-02-01 16:51:27 Loading hostnames
2006-02-01 16:51:27 Loading saved state
2006-02-01 16:51:27 Setting up network listener on 0.0.0.0:1984
2006-02-01 16:51:27 Setting up signal handlers
2006-02-01 16:51:27 Setting up hobbitd channels
2006-02-01 16:51:27 Setting up logfiles
2006-02-01 16:51:34 Task hobbitd terminated by signal 6
2006-02-01 16:51:34 Task bbretest terminated by signal 15
2006-02-01 16:51:34 Task bbdisplay terminated by signal 15
2006-02-01 16:51:34 Task clientdata terminated, status 1
2006-02-01 16:51:34 Task rrddata terminated, status 1
2006-02-01 16:51:34 Task rrdstatus terminated, status 1
2006-02-01 16:51:34 Task bbhistory terminated, status 1
2006-02-01 16:51:34 Task bbstatus terminated, status 1
2006-02-01 16:51:34 Loading hostnames
2006-02-01 16:51:34 Loading saved state
2006-02-01 16:51:34 Setting up network listener on 0.0.0.0:1984
2006-02-01 16:51:34 Setting up signal handlers
2006-02-01 16:51:34 Setting up hobbitd channels
2006-02-01 16:51:34 Setting up logfiles
2006-02-01 16:51:34 Task hobbitclient terminated by signal 15
2006-02-01 16:51:34 Task bbnet terminated by signal 15
2006-02-01 16:52:32 Task hobbitd terminated by signal 6
2006-02-01 16:52:32 Task bbretest terminated by signal 15
2006-02-01 16:52:32 Task clientdata terminated, status 1
2006-02-01 16:52:32 Task rrddata terminated, status 1
2006-02-01 16:52:32 Task rrdstatus terminated, status 1
2006-02-01 16:52:32 Task bbhistory terminated, status 1
2006-02-01 16:52:32 Task bbstatus terminated, status 1
2006-02-01 16:52:32 Loading hostnames
2006-02-01 16:52:32 Loading saved state
2006-02-01 16:52:32 Setting up network listener on 0.0.0.0:1984
2006-02-01 16:52:32 Setting up signal handlers
2006-02-01 16:52:32 Setting up hobbitd channels
2006-02-01 16:52:32 Setting up logfiles
2006-02-01 16:52:37 Task bbdisplay terminated by signal 15
2006-02-01 16:53:32 Task hobbitd terminated by signal 6
2006-02-01 16:53:32 Task bbretest terminated by signal 15
2006-02-01 16:53:32 Task clientdata terminated, status 1
2006-02-01 16:53:32 Task rrddata terminated, status 1
2006-02-01 16:53:32 Task rrdstatus terminated, status 1
2006-02-01 16:53:32 Task bbhistory terminated, status 1
2006-02-01 16:53:32 Task bbstatus terminated, status 1
2006-02-01 16:53:32 Task bbdisplay terminated by signal 15
2006-02-01 16:56:29 Task bb-swap terminated, status 5
2006-02-01 16:56:51 Task bb-httpbench terminated, status 5
2006-02-01 17:01:30 Task bb-swap terminated, status 5
2006-02-01 17:01:52 Task bb-httpbench terminated, status 5
2006-02-01 17:02:36 Loading hostnames
2006-02-01 17:02:36 Loading saved state
2006-02-01 17:02:36 Setting up network listener on 0.0.0.0:1984
2006-02-01 17:02:36 Setting up signal handlers
2006-02-01 17:02:36 Setting up hobbitd channels
2006-02-01 17:02:36 Setting up logfiles
2006-02-01 17:03:33 Task hobbitd terminated by signal 6
2006-02-01 17:03:34 Task clientdata terminated, status 1
2006-02-01 17:03:34 Task rrddata terminated, status 1
2006-02-01 17:03:34 Task rrdstatus terminated, status 1
2006-02-01 17:03:34 Task bbhistory terminated, status 1
2006-02-01 17:03:34 Task bbstatus terminated, status 1
2006-02-01 17:03:34 Loading hostnames
2006-02-01 17:03:34 Loading saved state
2006-02-01 17:03:34 Setting up network listener on 0.0.0.0:1984
2006-02-01 17:03:34 Setting up signal handlers
2006-02-01 17:03:34 Setting up hobbitd channels
2006-02-01 17:03:34 Setting up logfiles
2006-02-01 17:03:34 Task bbdisplay terminated by signal 15
Thanks in advance for your help.
Chris
list Henrik Størner
On Wed, Feb 01, 2006 at 05:05:38PM +0100, Maxeiner, Christian wrote:
Everything worked fine but an hour later some of the hobbit processes are crashing shortly after restarting hobbit. I am running hobbit 4.1.2p1 on HP-UX 11.11. This is the output of the hobbitlaunch.log file: 2006-02-01 16:51:27 Setting up hobbitd channels 2006-02-01 16:51:27 Setting up logfiles 2006-02-01 16:51:34 Task hobbitd terminated by signal 6
What's in the hobbitd.log file? After the "Setting up logfiles" message, that's where the hobbitd output goes.
The crash ought to generate a core file in the ~hobbit/data/tmp/ directory. Could you check this, and if there is a core file, run it through gdb as described in http://www.hswn.dk/hobbit/help/known-issues.html#bugreport ?
Thanks, Henrik
list Christian Maxeiner
Hi Henrik,
these are the entries of hobbitd.log:
2006-02-01 16:02:18 Setup complete
2006-02-01 16:19:14 Setup complete
2006-02-01 16:19:20 Setup complete
2006-02-01 16:20:21 Setup complete
2006-02-01 16:30:23 Setup complete
2006-02-01 16:31:25 Setup complete
2006-02-01 16:32:24 Setup complete
2006-02-01 16:37:18 Setup complete
2006-02-01 16:37:25 Setup complete
2006-02-01 16:38:29 Setup complete
2006-02-01 16:42:00 Setup complete
2006-02-01 16:42:06 Setup complete
2006-02-01 16:43:10 Setup complete
2006-02-01 16:51:27 Setup complete
2006-02-01 16:51:34 Setup complete
2006-02-01 16:52:32 Setup complete
2006-02-01 17:02:36 Setup complete
2006-02-01 17:03:34 Setup complete
2006-02-01 17:04:34 Setup complete
2006-02-01 17:14:36 Setup complete
Seems to be normal output.
Output of gdb:
Hewlett-Packard Wildebeest 1.0 (based on GDB 4.16)
(built for PA-RISC 1.1 or 2.0 (narrow), HP-UX 11.00)
Copyright 1996, 1997 Free Software Foundation, Inc...
Core was generated by `hobbitd'.
Program terminated with signal 6, Aborted.
warning: The shared libraries were not privately mapped; setting a breakpoint in a shared library will not work until you rerun the program.
warning: Can't find file hobbitd referenced in dld_list.
#0 0xc020d5b8 in _kill () from /usr/lib/libc.2
#0 0xc020d5b8 in _kill () from /usr/lib/libc.2
(gdb) bt
#0 0xc020d5b8 in _kill () from /usr/lib/libc.2
#1 0xc01a6f7c in raise () from /usr/lib/libc.2
#2 0xc01e81e0 in _abort_C () from /usr/lib/libc.2
#3 0xc01e823c in _abort () from /usr/lib/libc.2
#4 0x12bc8 in sigsegv_handler (signum=2063889120) at sig.c:57
#5 <signal handler called>
#6 0xc0199038 in _sigfillset () from /usr/lib/libc.2
#7 0xc0195bec in _sscanf () from /usr/lib/libc.2
#8 0xc019b510 in realloc () from /usr/lib/libc.2
#9 0x104a8 in xrealloc (ptr=0x4010dffc, size=0) at memory.c:149
#10 0x8aac in do_message (msg=0x3c610c68, origin=0x0) at hobbitd.c:2222
#11 0xc17c in main (argc=10485759, argv=0x40009cb8) at hobbitd.c:3512
#12 0xc0143478 in _start () from /usr/lib/libc.2
(gdb)
Thanks, Chris
-----Original Message-----
From: Henrik Stoerner [mailto:user-ce4a2c883f75@xymon.invalid]
Sent: Wednesday, 1 February 2006 17:15
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] Hobbit crashes
On Wed, Feb 01, 2006 at 05:05:38PM +0100, Maxeiner, Christian wrote:
Everything worked fine but an hour later some of the hobbit processes are crashing shortly after restarting hobbit. I am running hobbit 4.1.2p1 on HP-UX 11.11. This is the output of the hobbitlaunch.log file:
2006-02-01 16:51:27 Setting up hobbitd channels
2006-02-01 16:51:27 Setting up logfiles
2006-02-01 16:51:34 Task hobbitd terminated by signal 6
What's in the hobbitd.log file ? After the "Setting up logfiles", that's where the hobbitd output goes. This ought to generate a core-file in the ~hobbit/data/tmp/ directory. Could you check this, and if there is a core file then run it through gdb as described in http://www.hswn.dk/hobbit/help/known-issues.html#bugreport ? Thanks, Henrik
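Editorial note: frames #0 to #5 of the backtrace above show a SIGSEGV handler (sigsegv_handler in sig.c) calling abort(), which would explain why hobbitlaunch.log reports signal 6 (SIGABRT) rather than a segmentation fault. The following is only a minimal sketch of that general pattern, not the code from Hobbit's sig.c. The names crash_handler, install_crash_handler and core_dir are hypothetical, and changing into a directory before aborting is an assumption based on Henrik's remark that the core file should appear in ~hobbit/data/tmp/.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical: directory where the core file should be written. */
static const char *core_dir = NULL;

/* Runs on SIGSEGV. Uses only async-signal-safe calls (chdir, signal, abort). */
static void crash_handler(int signum)
{
    (void)signum;
    if (core_dir != NULL && chdir(core_dir) != 0) {
        /* Could not change directory; the core lands in the current one. */
    }
    signal(SIGABRT, SIG_DFL);   /* make sure abort() is not intercepted */
    abort();                    /* process dies with SIGABRT (signal 6) */
}

/* Install the handler. Returns 0 on success, -1 on failure (as sigaction does). */
int install_crash_handler(const char *dir)
{
    struct sigaction sa;

    assert(dir != NULL);
    core_dir = dir;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

With a handler like this installed, the parent process sees the child terminated by signal 6 even though the underlying fault was a SIGSEGV, so the backtrace frames below the "<signal handler called>" marker are the ones that point at the real problem.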
list Henrik Størner
On Wed, Feb 01, 2006 at 05:18:20PM +0100, Maxeiner, Christian wrote:
Output of gdb: (gdb) bt
#5 <signal handler called>
#6 0xc0199038 in _sigfillset () from /usr/lib/libc.2
#7 0xc0195bec in _sscanf () from /usr/lib/libc.2
#8 0xc019b510 in realloc () from /usr/lib/libc.2
#9 0x104a8 in xrealloc (ptr=0x4010dffc, size=0) at memory.c:149
#10 0x8aac in do_message (msg=0x3c610c68, origin=0x0) at hobbitd.c:2222
#11 0xc17c in main (argc=10485759, argv=0x40009cb8) at hobbitd.c:3512
Very odd. The interesting thing is that hobbitd here is doing a re-allocation of a buffer, but asking for 0 bytes - and apparently, HP-UX doesn't like that. But I don't see how it can get to asking for 0 bytes in that part of the code...
Could you start gdb again, but instead of the "bt" command do this:
gdb> fr 10
gdb> p used
gdb> p needed
gdb> p bufsz
gdb> p bufp
gdb> p buf
and mail me the output?
Thanks, Henrik
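Editorial note: in ISO C (C99 through C17) the result of realloc() with a size of 0 is implementation-defined, so portable code usually avoids making that request at all. The following is a minimal sketch of a defensive wrapper, assuming a hypothetical function named safe_realloc. It is not the xrealloc() from Hobbit's memory.c, and it is not offered as a diagnosis of this crash.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper around realloc() that never passes size 0 to the
 * C library and aborts on allocation failure instead of returning NULL.
 */
void *safe_realloc(void *ptr, size_t size)
{
    void *result;

    if (size == 0) {
        size = 1;               /* sidestep implementation-defined realloc(p, 0) */
    }
    assert(size > 0);

    result = realloc(ptr, size);
    if (result == NULL) {
        fprintf(stderr, "safe_realloc: out of memory (%lu bytes)\n",
                (unsigned long)size);
        abort();
    }
    return result;
}
```

Rounding a zero-byte request up to one byte keeps the returned pointer valid for a later free() or realloc() on every platform, at the cost of a tiny allocation.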
list Christian Maxeiner
Henrik, here's the output:
#0 0xc020d5b8 in _kill () from /usr/lib/libc.2
#0 0xc020d5b8 in _kill () from /usr/lib/libc.2
(gdb) fr 10
#10 0x8aac in do_message (msg=0x6e651a64, origin=0x0) at hobbitd.c:2222
2222 buf = (char *)realloc(buf, bufsz);
(gdb) p used
$1 = 1074320844
(gdb) p needed
$2 = 1024
(gdb) p bufsz
$3 = 30832
(gdb) p bufp
$4 = (char *) 0x0
(gdb) p buf
$5 = (char *) 0x8c70 "\b\034\002X\204`!0\013\205\n%4\023"
(gdb)
Thanks, Chris.
-----Original Message-----
From: Henrik Stoerner [mailto:user-ce4a2c883f75@xymon.invalid]
Sent: Wednesday, 1 February 2006 19:29
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] Hobbit crashes
On Wed, Feb 01, 2006 at 05:18:20PM +0100, Maxeiner, Christian wrote:
Output of gdb: (gdb) bt
#5 <signal handler called>
#6 0xc0199038 in _sigfillset () from /usr/lib/libc.2
#7 0xc0195bec in _sscanf () from /usr/lib/libc.2
#8 0xc019b510 in realloc () from /usr/lib/libc.2
#9 0x104a8 in xrealloc (ptr=0x4010dffc, size=0) at memory.c:149
#10 0x8aac in do_message (msg=0x3c610c68, origin=0x0) at hobbitd.c:2222
#11 0xc17c in main (argc=10485759, argv=0x40009cb8) at hobbitd.c:3512
Very odd. The interesting thing is that hobbitd here is doing a re-allocation of a buffer, but asking for 0 bytes - and apparently, HP-UX doesn't like that. But I don't see how it can get to asking for 0 bytes in that part of the code... Could you start gdb again, but instead of the "bt" command do this: gdb> fr 10 gdb> p used gdb> p needed gdb> p bufsz gdb> p bufp gdb> p buf and mail me the output? Thanks, Henrik
list Henrik Størner
On Wed, Feb 01, 2006 at 07:34:35PM +0100, Maxeiner, Christian wrote:
Henrik, here's the output:
#0 0xc020d5b8 in _kill () from /usr/lib/libc.2
#0 0xc020d5b8 in _kill () from /usr/lib/libc.2
(gdb) fr 10
#10 0x8aac in do_message (msg=0x6e651a64, origin=0x0) at hobbitd.c:2222
2222 buf = (char *)realloc(buf, bufsz);
(gdb) p used
$1 = 1074320844
(gdb) p needed
$2 = 1024
(gdb) p bufsz
$3 = 30832
(gdb) p bufp
$4 = (char *) 0x0
(gdb) p buf
$5 = (char *) 0x8c70 "\b\034\002X\204`!0\013\205\n%4\023"
Still very odd. The "used" number is insanely high, and bufp is 0. The first could be a result of the latter, but I cannot understand how that could happen (bufp starts out being the same as buf, and it only grows).
What happens if you shut down Hobbit, rename the ~hobbit/data/tmp/hobbitd.chk file to something else, and start Hobbit up again?
If that makes it work, I'd very much like to have a copy of that file. It will include a lot of internal information about your tests, so make sure you have permission to make it available to me. And send it to me directly - user-ce4a2c883f75@xymon.invalid - instead of to the list.
Regards, Henrik
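Editorial note: the variables printed in the gdb session (buf, bufp, bufsz, used, needed) suggest a buffer that grows with realloc() while a cursor pointer tracks the write position. One general hazard with that pattern is that realloc() may move the block, leaving a separately stored cursor pointer stale. The following is a minimal sketch, assuming a hypothetical msgbuf_t type and msgbuf_append function, that keeps the cursor as an offset instead of a pointer. The field names are borrowed from the gdb output, but their exact meaning in hobbitd.c is an assumption, and this is not the code from hobbitd.c nor a diagnosis of the crash.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable message buffer. */
typedef struct {
    char  *buf;    /* start of the allocation (may move on realloc) */
    size_t bufsz;  /* bytes currently allocated */
    size_t used;   /* bytes written so far; an offset survives a move */
} msgbuf_t;

/*
 * Append len bytes and keep the buffer NUL-terminated.
 * Grows in 1024-byte steps. Returns 0 on success, -1 if realloc fails
 * (in which case the existing buffer is left untouched and still valid).
 */
int msgbuf_append(msgbuf_t *mb, const char *data, size_t len)
{
    const size_t chunk = 1024;
    size_t needed = mb->used + len + 1;          /* +1 for the terminator */

    if (needed > mb->bufsz) {
        size_t newsz = ((needed / chunk) + 1) * chunk;
        char *newbuf = realloc(mb->buf, newsz);
        if (newbuf == NULL) {
            return -1;
        }
        mb->buf = newbuf;                        /* only now replace the old pointer */
        mb->bufsz = newsz;
    }

    memcpy(mb->buf + mb->used, data, len);       /* cursor derived from the offset */
    mb->used += len;
    mb->buf[mb->used] = '\0';

    assert(mb->used < mb->bufsz);                /* invariant: room for the terminator */
    return 0;
}
```

Starting from a zero-initialized msgbuf_t, each call recomputes the write position as buf + used, so the cursor can never point into a block that realloc() has already released.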