Xymon Mailing List Archive

coredump on hobbitfetch

8 messages in this thread

Olivier Beau · Tue, 13 Jan 2009 13:39:59 +0100
$ /data/hobbit/server$ gdb bin/hobbitfetch  tmp/core
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...Using host libthread_db 
library "/lib/tls/i686/cmov/libthread_db.so.1".


warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/tls/i686/cmov/libc.so.6...done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `/data/hobbit/server/bin/hobbitfetch 
--server=127.0.0.1 --no-daemon --pidfile=/d'.
Program terminated with signal 6, Aborted.
#0  0xb7f4f410 in ?? ()
(gdb) bt
#0  0xb7f4f410 in ?? ()
#1  0xbfc2ed9c in ?? ()
#2  0x00000006 in ?? ()
#3  0x0000734c in ?? ()
#4  0xb7e28811 in raise () from /lib/tls/i686/cmov/libc.so.6
#5  0xb7e29fb9 in abort () from /lib/tls/i686/cmov/libc.so.6
#6  0x08051a01 in sigsegv_handler (signum=11) at sig.c:58
#7  0xb7f4f420 in ?? ()
#8  0x0000000b in ?? ()
#9  0x00000033 in ?? ()
#10 0x00000000 in ?? ()
(gdb)
Gatis A. · Mon, 26 Jan 2009 10:10:36 +0200
Hi,

I got the same problem on RHEL5.2 with kernel 2.6.18-92.1.18.el5,
hobbitfetch goes purple with "- - Program crashed Fatal signal caught!"
and dumps core:

[hob at localhost tmp]$ file core.19489
core.19489: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV),
SVR4-style, from 'hobbitfetch'

[hob at localhost tmp]$ gdb ../bin/hobbitfetch core.19489
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.

This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db
library "/lib/libthread_db.so.1".


warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `/home/hob/xymon/server/bin/hobbitfetch
--server=127.0.0.1 --no-daemon --pid'.
Program terminated with signal 6, Aborted.
#0  0x00e83402 in __kernel_vsyscall ()
(gdb) quit
Gatis A. · Tue, 27 Jan 2009 10:01:11 +0200
More info.

After the following messages appear in the log file, hobbitfetch dumps core and goes red:

2009-01-27 08:33:30 Got 1380 bytes of data from 192.168.0.1:1984 (req 2388)
2009-01-27 08:33:30 Got 1380 bytes of data from 192.168.0.1:1984 (req 2388)
2009-01-27 08:33:30 Got 1380 bytes of data from 192.168.0.1:1984 (req 2389)
2009-01-27 08:33:30 Got 1380 bytes of data from 192.168.0.1:1984 (req 2389)

Core was generated by `/home/hob/xymon/server/bin/hobbitfetch
--server=127.0.0.1 --no-daemon --pid'.
Program terminated with signal 6, Aborted.

#0  0x00495402 in __kernel_vsyscall ()
(gdb) backtrace
#0  0x00495402 in __kernel_vsyscall ()
#1  0x00501d10 in raise () from /lib/libc.so.6
#2  0x00503621 in abort () from /lib/libc.so.6
#3  0x08050ff3 in sigsegv_handler (signum=11) at sig.c:57
#4  <signal handler called>
#5  main (argc=5, argv=Cannot access memory at address 0x4
) at hobbitfetch.c:709
(gdb)

In addition, the "hobbitfetch" column goes "red", then "purple", and never returns
to "green" status.
It looks like hobbitfetch sends only "red" statuses; "purple" needs to be
dropped manually.
Henrik Størner · Wed, 28 Jan 2009 11:59:07 +0000 (UTC)
In <user-bef4a4d99d1d@xymon.invalid> "Gatis A." <user-e47f4dceddb4@xymon.invalid> writes:
> Core was generated by `/home/hob/xymon/server/bin/hobbitfetch
> --server=127.0.0.1 --no-daemon --pid'.
> Program terminated with signal 6, Aborted.
> #0  0x00495402 in __kernel_vsyscall ()
> (gdb) backtrace
> #0  0x00495402 in __kernel_vsyscall ()
> #1  0x00501d10 in raise () from /lib/libc.so.6
> #2  0x00503621 in abort () from /lib/libc.so.6
> #3  0x08050ff3 in sigsegv_handler (signum=11) at sig.c:57
> #4  <signal handler called>
> #5  main (argc=5, argv=Cannot access memory at address 0x4
> ) at hobbitfetch.c:709

Could you try grabbing the current development version
http://hobbitmon.svn.sourceforge.net/viewvc/hobbitmon/trunk/
(from the "Download GNU tarball" link at the bottom).

Run the "configure" script and "make hobbitd-build", then
copy hobbitd/hobbitfetch to your ~hobbit/server/bin/
directory. The crash happens in a part of the code which
was modified extensively since the 4.2.x release, and I
think it has been fixed by that rewrite.


Regards,
Henrik
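
The build steps described above could be scripted roughly as follows. This is only a sketch: the function name build_hobbitfetch and its two arguments (the unpacked trunk directory and the server bin directory) are hypothetical, and it assumes the tarball has already been downloaded and unpacked. Keeping a copy of the old binary is an addition that the message does not mention.

```shell
#!/bin/sh
# Sketch: build hobbitfetch from an unpacked trunk tree and install it.
# build_hobbitfetch and its arguments are illustrative names, not part of Hobbit.
build_hobbitfetch() {
    src_dir=$1   # unpacked development tarball (assumed location)
    bin_dir=$2   # for example ~hobbit/server/bin

    # Run the configure script, then build only the hobbitd programs.
    ( cd "$src_dir" && ./configure && make hobbitd-build ) || return 1

    # Keep the old binary so the change can be reverted (not in the message).
    if [ -f "$bin_dir/hobbitfetch" ]; then
        cp -p "$bin_dir/hobbitfetch" "$bin_dir/hobbitfetch.orig" || return 1
    fi

    # Copy the freshly built binary into place.
    cp "$src_dir/hobbitd/hobbitfetch" "$bin_dir/hobbitfetch"
}
```

It would be called as, for example, build_hobbitfetch ./trunk ~hobbit/server/bin, where ./trunk is an assumed directory name. The running hobbitfetch probably has to be stopped first, since copying over a binary that is executing can fail.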
Olivier Beau · Wed, 28 Jan 2009 14:31:21 +0100
still have hobbitfetch going red


here's the backtrace :
# gdb ../bin/hobbitfetch core
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".


warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/tls/i686/cmov/libc.so.6...done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `/data/hobbit/server/bin/hobbitfetch --server=127.0.0.1 --no-daemon --pidfile=/d'.
Program terminated with signal 6, Aborted.

#0  0xb7f0b410 in ?? ()
(gdb) bt
#0  0xb7f0b410 in ?? ()
#1  0xbff2b4ec in ?? ()
#2  0x00000006 in ?? ()
#3  0x000068a6 in ?? ()
#4  0xb7de4811 in raise () from /lib/tls/i686/cmov/libc.so.6
#5  0xb7de5fb9 in abort () from /lib/tls/i686/cmov/libc.so.6
#6  0x08051a21 in sigsegv_handler (signum=11) at sig.c:58
#7  0xb7f0b420 in ?? ()
#8  0x0000000b in ?? ()
#9  0x00000033 in ?? ()
#10 0x00000000 in ?? ()
(gdb)


In top, I see that hobbitfetch is at 100% CPU
(I'm only doing pulldata on fewer than 10 hosts)


   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND 
18201 hobbit    25   0  2500 1344  524 R  100  0.1   7:59.46 hobbitfetch 
20623 hobbit    15   0  7248 4244 1280 S    0  0.2   0:41.37 hobbitd_rrd 
20816 hobbit    18   0  5088 3108 1300 S    0  0.1   0:14.60 hobbitd_client
     1 root      15   0  1948  652  552 S    0  0.0   0:01.23 init 
     2 root      RT   0     0    0    0 S    0  0.0   0:00.05 migration/0
     3 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
     4 root      RT   0     0    0    0 S    0  0.0   0:00.02 migration/1
...


Is there a special signal I should send to hobbitfetch?
Or should I just do a kill -9 on it?


I don't see anything interesting in hobbitfetch.log (in debug mode);
anyway, since hobbitfetch is at 100% CPU, no logs are appearing now.


Olivier
Henrik Størner · Wed, 28 Jan 2009 21:20:26 +0000 (UTC)
In <user-4b113e8798e1@xymon.invalid> Olivier Beau <user-eb340192b6fc@xymon.invalid> writes:
> In top, I see that hobbitfetch is at 100% CPU
> (I'm only doing pulldata on fewer than 10 hosts)
> Is there a special signal I should send to hobbitfetch?
> Or should I just do a kill -9 on it?

"kill -6" causes it to dump core, which might be interesting.
It's a different bug from the crash, though (and it's been reported
before, but I cannot seem to get a handle on what triggers it).


Regards,
Henrik
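
As a rough illustration of what "kill -6" does: signal 6 is SIGABRT on Linux, and its default action is to terminate the process and write a core file if the core size limit allows it. The sketch below uses a sleep process as a stand-in for the spinning hobbitfetch; on a real server the PID would come from top or from the pidfile. The variable names are illustrative.

```shell
#!/bin/sh
# Show the current core size limit; 0 means no core file will be written.
echo "core size limit: $(ulimit -c)"

# Stand-in for the busy hobbitfetch process (assumption for this demo).
sleep 60 &
pid=$!

# Send signal 6 (SIGABRT), the same signal abort() raises.
kill -6 "$pid"

# Collect the exit status of the aborted process.
wait "$pid"
status=$?
echo "exit status: $status"
```

In bash and dash the reported status is 128 plus the signal number, so a process ended by signal 6 shows up as 134. If the limit printed on the first line is 0, running "ulimit -c unlimited" in the shell that starts hobbitfetch is the usual way to allow core files.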
Olivier Beau · Thu, 29 Jan 2009 08:32:45 +0100
here's the backtrace

# gdb /data/hobbit/server/bin/hobbitfetch core
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...Using host libthread_db 
library "/lib/tls/i686/cmov/libthread_db.so.1".


warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/tls/i686/cmov/libc.so.6...done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `/data/hobbit/server/bin/hobbitfetch 
--server=127.0.0.1 --no-daemon --pidfile=/d'.
Program terminated with signal 6, Aborted.

#0  0x0804ae65 in main (argc=6, argv=Cannot access memory at address 0x4
) at hobbitfetch.c:748
748                             switch (connwalk->action) {
(gdb) bt
#0  0x0804ae65 in main (argc=6, argv=Cannot access memory at address 0x4
) at hobbitfetch.c:748
(gdb)
Gatis A. · Tue, 10 Feb 2009 16:57:04 +0200
Is there any suspicion as to what causes the hobbitfetch crash?

This is how often hobbitfetch crashes on my server (I have 16 hosts
with pulldata tag):

Tue Feb 10 16:46:22 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 16:11:11 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 16:11:11 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 16:00:35 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 15:20:08 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 14:15:57 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 13:51:04 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 13:45:50 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 13:34:59 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 13:24:58 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 13:24:58 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 13:23:47 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 13:13:45 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 13:08:59 2009        xymon-server    hobbitfetch     unknown From -> To      red
Tue Feb 10 11:03:22 2009        xymon-server    hobbitfetch     unknown From -> To      red

just another backtrace:

Core was generated by `/home/hob/xymon/server/bin/hobbitfetch
--server=xxx.xxx.xxx.xxx --no-daemon --pid'.
Program terminated with signal 6, Aborted.
#0  0x0095b402 in __kernel_vsyscall ()
(gdb) backtrace
#0  0x0095b402 in __kernel_vsyscall ()
#1  0x00501d10 in raise () from /lib/libc.so.6
#2  0x00503621 in abort () from /lib/libc.so.6
#3  0x08050ff3 in sigsegv_handler (signum=11) at sig.c:57
#4  <signal handler called>

#5  main (argc=4, argv=Cannot access memory at address 0x4
) at hobbitfetch.c:709
(gdb)

