
hobbitd status-board not available

David Gore
Tue, 11 Oct 2005 16:56:32 +0000
Message-Id: <user-cf47f7cbd727@xymon.invalid>

Henrik Stoerner wrote:
On Sat, Oct 08, 2005 at 04:08:57PM -0600, David Gore wrote:
  
What does this message mean?  Typically we get this when disabling 
multiple hosts.  Is it a host resource issue, or is something not 
replying quickly enough?  We are on the snapshot from 03 October.  This 
has been happening over many weeks and different snapshots.  OS is Solaris 9.
    
It really points to a bug in the hobbitd daemon - it means that some
task (usually bbdisplay) couldn't fetch the status information from
the Hobbit server, which it uses to build the webpages.

I'm somewhat alarmed if you have this problem with such a recent 
snapshot. I know there was a bug in 4.1.1 (and earlier) that could 
trigger this when disabling or renaming hosts, but that should not
happen with the snapshot from 03 Oct.

  
I am pretty sure these happen as people disable hosts and the disable 
fails. Although bb2.html shows them going to blue in the history, they 
will not show up on the enable/disable screen, and they usually show as 
failed when executing the disable.
    
Interesting. I'll go over that particular piece of code again to
see if I can come up with an explanation. If you have a way of
triggering this, let me know - in that case, I'd like you to try out
some things to make sure it is fixed.


Regards,
Henrik

It is still happening with the latest 4.1.2 install.  A multi-host (~75+ 
hosts) disable worked, but later, on the enable, it looks like 
hobbitd crashed:

hobbit@hobbit:/export/home/hobbit/server> find . -name core
./tmp/core
hobbit@hobbit:/export/home/hobbit/server> ls -al ./tmp/core
-rw-------   1 hobbit   other    13630500 Oct 11 16:46 ./tmp/core
hobbit@hobbit:/export/home/hobbit/server> file ./tmp/core
./tmp/core:     ELF 32-bit MSB core file SPARC Version 1, from 'hobbitd'
hobbit@hobbit:/export/home/hobbit/server> gdb bin/hobbitd tmp/core
GNU gdb 6.0
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.9"...
Core was generated by `hobbitd --pidfile=/export/home/hobbit/server/logs/hobbitd.pid --restart=/export'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/lib/libresolv.so.2...done.
Loaded symbols for /usr/lib/libresolv.so.2
Reading symbols from /usr/lib/libsocket.so.1...done.
Loaded symbols for /usr/lib/libsocket.so.1
Reading symbols from /usr/lib/libnsl.so.1...done.
Loaded symbols for /usr/lib/libnsl.so.1
Reading symbols from /usr/lib/libc.so.1...done.
Loaded symbols for /usr/lib/libc.so.1
Reading symbols from /usr/lib/libdl.so.1...done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /usr/lib/libmp.so.2...done.
Loaded symbols for /usr/lib/libmp.so.2
Reading symbols from /usr/platform/SUNW,Ultra-60/lib/libc_psr.so.1...done.
Loaded symbols for /usr/platform/SUNW,Ultra-60/lib/libc_psr.so.1
#0  0xff19fff8 in _libc_kill () from /usr/lib/libc.so.1
(gdb) bt
#0  0xff19fff8 in _libc_kill () from /usr/lib/libc.so.1
#1  0xff136cd8 in abort () from /usr/lib/libc.so.1
#2  0x00021080 in sigsegv_handler (signum=10) at sig.c:57
#3  <signal handler called>
(gdb)

Can you give me directions on how I can do a relatively clean install 
and still retain all my historical information?

~David