Xymon Mailing List Archive

Hobbit core dump every 5 minutes

5 messages in this thread

list Brian Daly · Tue, 18 Jan 2011 18:42:51 +0000 ·
Hi Guys,

Today my Hobbit server started throwing up purple alerts for all my network (conn) tests and various other TCP connection tests. I noticed that a core file was being created every 5 minutes. Here is what gdb has to say:


[hobbit@monitor server]$ gdb bin/hobbitd core.4115
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5_5.2)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/hobbit/server/bin/hobbitd...done.

warning: core file may not match specified executable file.
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `bbtest-net --report --ping --checkresponse'.
Program terminated with signal 6, Aborted.
#0  0x00e3d402 in __kernel_vsyscall ()
(gdb) bt
#0  0x00e3d402 in __kernel_vsyscall ()
#1  0x005f2040 in ?? ()
#2  0x00720ff4 in ?? ()
#3  0xb7f356d0 in ?? ()
#4  0xbf8057a8 in ?? ()
#5  0x005f3a21 in ?? ()
#6  0x00000006 in ?? ()
#7  0xbf80571c in ?? ()
#8  0x00000000 in ?? ()


This is the latest output from the bb-network.log  file - 

xstrdup: Cannot dup NULL string


not sure how to find out what version of Hobbit I'm running.

Any help would be greatly appreciated.

Regards

Brian Daly.
list David Baldwin · Wed, 19 Jan 2011 09:09:45 +1100 ·
quoted from Brian Daly
On 19/01/11 5:42 AM, Brian Daly wrote:
not sure how to find out what version of Hobbit I'm running.
To show the version:

$ bbcmd bbtest-net --version
2011-01-19 09:03:48 Using default environment file /usr/lib/hobbit/server/etc/hobbitserver.cfg
bbtest-net version 4.2.3
SSL library : OpenSSL 0.9.8e-rhel5 01 Jul 2008
LDAP library: OpenLDAP 20343

Try running the gdb again against the bbtest-net binary at
/usr/local/hobbit/server/bin/bbtest-net on your system.
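
To avoid guessing which binary a core file belongs to, you can read the "Core was generated by" line that gdb prints even when it is given the wrong executable, as in Brian's output. A minimal shell sketch follows. It assumes gdb keeps that exact wording, and `core_program` is a hypothetical helper name, not part of Hobbit or gdb.

```shell
# core_program: read gdb's startup output on stdin and print the name of the
# program that produced the core (the first word after the opening backtick).
# Assumption: gdb prints a line of the form
#   Core was generated by `PROGRAM ARGS...'.
core_program() {
    sed -n "s/^Core was generated by \`\([^ ']*\).*/\1/p"
}

# Example, using the file names from this thread (adjust for your install):
#   gdb -batch bin/hobbitd core.4115 2>&1 | core_program
```

The name it prints is the binary to pass to gdb together with the same core file.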

David.

-- 
David Baldwin - IT Unit
Australian Sports Commission          www.ausport.gov.au
Tel 02 62147830 Fax 02 62141830       PO Box 176 Belconnen ACT 2616
user-cbbf693f2c89@xymon.invalid          Leverrier Street Bruce ACT 2617


list Henrik Størner · Tue, 18 Jan 2011 23:05:53 +0000 (UTC) ·
In <user-37e92795944a@xymon.invalid> Brian Daly <user-afc856918890@xymon.invalid> writes:
> today my hobbit server started throwing up purple alerts for all my
> network (conn) and various other tcp connection tests. I noticed that a
> core file was being created every 5 minutes. Here is what gdb has to say
> [hobbit@monitor server]$ gdb bin/hobbitd core.4115

You're feeding gdb the wrong program. It says
  Core was generated by `bbtest-net --report --ping --checkresponse'.
So instead of "bin/hobbitd", use "bin/bbtest-net". That should
(hopefully) give us a much more usable gdb output.

> This is the latest output from the bb-network.log file -
> xstrdup: Cannot dup NULL string

Hmm, pretty generic that one. Anything else in that log?

You can run bbtest-net by hand - with
  $ bbcmd bbtest-net HOSTNAME
where HOSTNAME is the name of one of your hosts. It would be
interesting to know if it crashes if you test only one host - 
that would point to the problem being in global configuration.
If it runs fine with one host, then the problem is probably
in the configuration of one particular host in bb-hosts.
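
The one-host-at-a-time check can be scripted if there are many hosts. This is only a rough sketch under several assumptions: that host lines in bb-hosts start with an IP address followed by the hostname, that a crash shows up as a non-zero exit status from the command, and that the function names are hypothetical helpers rather than anything shipped with Hobbit.

```shell
# list_hosts FILE: print the hostname column of a bb-hosts style file.
# Assumption: host entries look like "IP-ADDRESS HOSTNAME # tests...".
list_hosts() {
    awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $2 }' "$1"
}

# find_failing_hosts FILE CMD...: run "CMD... HOST" once per host and print
# every host for which the command exits non-zero (an abort counts as that).
find_failing_hosts() {
    hosts_file=$1
    shift
    list_hosts "$hosts_file" | while read -r host; do
        # stdin is redirected so the command cannot swallow the host list
        if ! "$@" "$host" </dev/null >/dev/null 2>&1; then
            echo "$host"
        fi
    done
}

# Example (the bb-hosts location is an assumption, adjust for your install):
#   find_failing_hosts /usr/local/hobbit/server/etc/bb-hosts bbcmd bbtest-net
```

If every host is printed, that points at the global configuration. If only one or two are printed, look at their entries in bb-hosts.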

I suspect it is a misconfiguration of one network test; there
have been some problems with error handling when network tests
(especially URLs in complex web or LDAP checks) were not
specified correctly.

> not sure how to find out what version of Hobbit I'm running.

  bin/bbtest-net --version


Regards,
Henrik
list Brian Daly · Tue, 18 Jan 2011 23:14:21 +0000 ·
Hi David,

Here are the results of gdb against bbtest-net:

gdb bin/bbtest-net core.9941 
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5_5.2)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...

Reading symbols from /usr/local/hobbit/server/bin/bbtest-net...done.

warning: .dynamic section for "/lib/libnss_files.so.2" is not at the expected address

warning: difference appears to be caused by prelink, adjusting expectations
Reading symbols from /usr/lib/libldap-2.3.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libldap-2.3.so.0
Reading symbols from /usr/lib/liblber-2.3.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/liblber-2.3.so.0
Reading symbols from /lib/libssl.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libssl.so.6
Reading symbols from /lib/libcrypto.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libcrypto.so.6
Reading symbols from /lib/i686/nosegneg/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/i686/nosegneg/libc.so.6
Reading symbols from /lib/libresolv.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /usr/lib/libsasl2.so.2...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libsasl2.so.2
Reading symbols from /usr/lib/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgssapi_krb5.so.2
Reading symbols from /usr/lib/libkrb5.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libkrb5.so.3
Reading symbols from /lib/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libcom_err.so.2
Reading symbols from /usr/lib/libk5crypto.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libk5crypto.so.3
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libcrypt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /usr/lib/libkrb5support.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libkrb5support.so.0
Reading symbols from /lib/libkeyutils.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libkeyutils.so.1
Reading symbols from /lib/libselinux.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libselinux.so.1
Reading symbols from /lib/libsepol.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libsepol.so.1
Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_dns.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_dns.so.2
Core was generated by `bbtest-net --report --ping --checkresponse --no-ares'.
Program terminated with signal 6, Aborted.
#0  0x00ea0402 in __kernel_vsyscall ()
(gdb) bt
#0  0x00ea0402 in __kernel_vsyscall ()
#1  0x005f2040 in raise () from /lib/i686/nosegneg/libc.so.6
#2  0x005f3a21 in abort () from /lib/i686/nosegneg/libc.so.6
#3  0x0805d70c in xstrdup (s=0x0) at memory.c:169
#4  0x08051da0 in setup_ssl (item=0x918d8b8) at contest.c:626
#5  0x080531e9 in do_tcp_tests (timeout=10, concurrency=246) at contest.c:998
#6  0x080502b7 in main (argc=5, argv=0xbf980a84) at bbtest-net.c:2284


************

[root@monitor bin]# ./bbcmd bbtest-net --version
2011-01-18 23:13:47 Using default environment file /usr/local/hobbit/server/etc/hobbitserver.cfg
bbtest-net version 4.2.0
SSL library : OpenSSL 0.9.8b 04 May 2006
LDAP library: OpenLDAP 20327


thanks

Brian

Brian Daly
Unix Administrator


Critical Path, Inc.
42-47, Lower Mount St.,
Dublin 2,
Ireland

Phone:	+XXX X XXX XXXX
Fax:	+XXX X XXX XXXX
www.criticalpath.net
list Henrik Størner · Wed, 19 Jan 2011 06:45:52 +0000 (UTC) ·
In <user-1bb18844e831@xymon.invalid> Brian Daly <user-afc856918890@xymon.invalid> writes:
> Here are the results of gdb against bbtest-net:
> gdb bin/bbtest-net core.9941
> Program terminated with signal 6, Aborted.
> #0  0x00ea0402 in __kernel_vsyscall ()
> (gdb) bt
> #0  0x00ea0402 in __kernel_vsyscall ()
> #1  0x005f2040 in raise () from /lib/i686/nosegneg/libc.so.6
> #2  0x005f3a21 in abort () from /lib/i686/nosegneg/libc.so.6
> #3  0x0805d70c in xstrdup (s=0x0) at memory.c:169
> #4  0x08051da0 in setup_ssl (item=0x918d8b8) at contest.c:626
> #5  0x080531e9 in do_tcp_tests (timeout=10, concurrency=246) at contest.c:998
> #6  0x080502b7 in main (argc=5, argv=0xbf980a84) at bbtest-net.c:2284
> [root@monitor bin]# ./bbcmd bbtest-net --version
> bbtest-net version 4.2.0
> SSL library : OpenSSL 0.9.8b 04 May 2006
> LDAP library: OpenLDAP 20327

4.2.0 is a pretty old version.

Based on the traceback, I would think this is the
"bbtest-net segfaults on SSL cert expire 2054" bug 
reported by the Debian folks in 
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=503111

This bugreport also includes a patch that you can use if you must
remain at version 4.2.0. But I would really recommend that you
update to at least 4.2.3, which is the last 4.2.x release.
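
If you have several servers to check against that 4.2.3 floor, the version line printed by `bbtest-net --version` can be compared numerically. A rough sketch, assuming purely numeric dotted versions; `version_at_least` is a hypothetical helper name, not a Hobbit command.

```shell
# version_at_least HAVE WANT: succeed (exit 0) if dotted version HAVE is
# greater than or equal to WANT, comparing each numeric field in turn.
# Missing fields are treated as 0, so "4.2" compares like "4.2.0".
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(want, w, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = h[i] + 0
            b = w[i] + 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}

# Example, using the output format shown in this thread:
#   have=$(./bbcmd bbtest-net --version 2>/dev/null | sed -n 's/^bbtest-net version //p')
#   version_at_least "$have" 4.2.3 || echo "bbtest-net $have is older than 4.2.3"
```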


Regards,
Henrik