Xymon Mailing List Archive

Hobbitd_history crashing

8 messages in this thread

list David Stuffle · Tue, 14 Jun 2005 10:59:31 -0500 ·
Hobbit 4.0.4
Fedora Core 3

My hobbitd_history has crashed several times in the past few weeks.  It goes
red with "Fatal signal caught" then it goes purple.  I have core files in
the server/tmp directory.  Here is the debug output you have requested in a
previous post.  Thanks!

-bash-3.00$ gdb bin/hobbitd_history tmp/core.26954
GNU gdb Red Hat Linux (6.1post-1.20040607.41rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db
library "/lib/tls/libthread_db.so.1".

Core was generated by `hobbitd_history'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x003ff7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0  0x003ff7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0043f955 in raise () from /lib/tls/libc.so.6
#2  0x00441319 in abort () from /lib/tls/libc.so.6
#3  0x0804d5de in sigsegv_handler (signum=11) at sig.c:57
#4  <signal handler called>
#5  0x0804a242 in main (argc=1, argv=0xfef4a724) at hobbitd_history.c:194
(gdb) 


~~~~~~~~~~~~~~
David Stuffle                       user-4d88f4a4f51e@xymon.invalid       
Delta Faucet Company                (XXX) XXX-XXXX


This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system manager.
Please note that any views or opinions presented in this email are solely
those of the author and do not necessarily represent those of the company.
Finally, the recipient should check this email and any attachments for the
presence of viruses. The company accepts no liability for any damage caused
by any virus transmitted by this email.
list Henrik Størner · Tue, 14 Jun 2005 22:18:14 +0200 ·
quoted from David Stuffle
On Tue, Jun 14, 2005 at 10:59:31AM -0500, Stuffle, David wrote:
Hobbit 4.0.4
Fedora Core 3

My hobbitd_history has crashed several times in the past few weeks.  It goes
red with "Fatal signal caught" then it goes purple.  I have core files in
the server/tmp directory.  Here is the debug output you have requested in a
previous post.  Thanks!

#3  0x0804d5de in sigsegv_handler (signum=11) at sig.c:57
#4  <signal handler called>
#5  0x0804a242 in main (argc=1, argv=0xfef4a724) at hobbitd_history.c:194

I guess this can happen if a "disable" message does not provide any
cause, e.g. "disable foo.test 180"

Does this patch fix it for you?


Regards,
Henrik


-------------- next part --------------
--- hobbitd/hobbitd_history.c	2005/04/25 12:39:51	1.36
+++ hobbitd/hobbitd_history.c	2005/06/14 20:17:03
@@ -13,7 +13,7 @@
 /*                                                                            */
 /*----------------------------------------------------------------------------*/
 
-static char rcsid[] = "$Id: hobbitd_history.c,v 1.36 2005/04/25 12:39:51 henrik Exp $";
+static char rcsid[] = "$Id: hobbitd_history.c,v 1.37 2005/06/14 20:16:59 henrik Exp $";
 
 #include <sys/types.h>
 #include <stdio.h>
@@ -158,7 +158,7 @@
 			oldcolor = parse_color(items[8]);
 			lastchg  = atoi(items[9]);
 			disabletime = atoi(items[10]);
-			dismsg   = items[11];
+			dismsg   = items[11]; if (!dismsg) dismsg = "(No reason given)";
 
 			if (save_histlogs) {
 				char *hostdash;
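
To illustrate how items[11] can end up NULL, here is a minimal sketch. It assumes the incoming message is split on '|' into a fixed-size array whose unused slots stay NULL; the thread does not show the real parsing code. The names split_fields, field_or and MAX_ITEMS are hypothetical and are not Hobbit functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 12  /* hypothetical: enough slots for items[0]..items[11] */

/* Splits buf in place on '|' and stores up to MAX_ITEMS field pointers.
 * Slots beyond the last field present are left NULL.
 * Returns the number of fields stored. */
size_t split_fields(char *buf, char *items[MAX_ITEMS])
{
    size_t n = 0;
    char *p = buf;

    assert(buf != NULL && items != NULL);
    for (size_t i = 0; i < MAX_ITEMS; i++) items[i] = NULL;

    while (p != NULL && n < MAX_ITEMS) {
        items[n++] = p;
        p = strchr(p, '|');
        if (p != NULL) *p++ = '\0';  /* terminate this field, step to the next */
    }
    return n;
}

/* Returns the field at idx, or fallback when that field is absent. */
const char *field_or(char *const items[MAX_ITEMS], size_t idx, const char *fallback)
{
    return (idx < MAX_ITEMS && items[idx] != NULL) ? items[idx] : fallback;
}
```

With a message that carries fewer than twelve fields, items[11] stays NULL, and field_or(items, 11, "(No reason given)") gives the same default text the patch uses. The returned pointer is const because the fallback is a string literal and must not be modified.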
list Sladewig · Fri, 17 Jun 2005 06:56:47 -0500 ·
Greetings!
One of our Windows machines had its RAID setup rearranged, and now there is
one less drive. We are now getting alerts "red H drive not found". How
can this one drive be removed?

thanks,
steve
list Gordon Thiesfeld · Fri, 17 Jun 2005 09:23:00 -0500 ·
Check the client on the Windows machine.  Make sure there is no setting for
drive H in the drives list.
quoted from Sladewig
-----Original Message-----
From: sladewig [mailto:user-25b160a6ee31@xymon.invalid]
Sent: Friday, June 17, 2005 6:57 AM
To: user-ae9b8668bcde@xymon.invalid
Subject: [hobbit] How to drop a disk drive

list Sladewig · Fri, 17 Jun 2005 09:24:45 -0500 ·
Aha! Figured it out myself. The client had the warning threshold for the H: drive raised beyond normal limits, and it was then reporting the drive as not found.
list David Stuffle · Mon, 20 Jun 2005 07:27:14 -0500 ·
Henrik,

After applying the patch, I am still getting the crashes.  Here is the core
file output. Thanks

-bash-3.00$ gdb bin/hobbitd_history tmp/core.27019
GNU gdb Red Hat Linux (6.1post-1.20040607.41rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db
library "/lib/tls/libthread_db.so.1".

Core was generated by `hobbitd_history'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x003ff7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0  0x003ff7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0043f955 in raise () from /lib/tls/libc.so.6
#2  0x00441319 in abort () from /lib/tls/libc.so.6
#3  0x0804d5f6 in sigsegv_handler (signum=11) at sig.c:57
#4  <signal handler called>
#5  nldecode (msg=0x804e926 "(No reason given)") at encoding.c:250
#6  0x0804a9ac in main (argc=1, argv=0xfeec01b4) at hobbitd_history.c:195
(gdb) 

~~~~~~~~~~~~~~
David Stuffle                       user-4d88f4a4f51e@xymon.invalid
Delta Faucet Company                (XXX) XXX-XXXX


list David Stuffle · Mon, 20 Jun 2005 10:42:45 -0500 ·
Just found out that the person disabling didn't put a duration or a reason.
I am having him change the scripts.

~~~~~~~~~~~~~~
David Stuffle               
Delta Faucet Company                
list Henrik Størner · Thu, 23 Jun 2005 13:45:20 +0200 ·
quoted from David Stuffle
On Mon, Jun 20, 2005 at 10:42:45AM -0500, Stuffle, David wrote:
Just found out that the person disabling didn't put a duration or a reason.
I am having him change the scripts.

Still, Hobbit shouldn't crash because of it. I've found out what the
problem was, and it will be fixed in 4.0.5.


Henrik
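
A note on the second backtrace: after the patch, the crash moved into nldecode(), with msg pointing at the "(No reason given)" string literal. The thread does not say what the 4.0.5 fix was. One plausible explanation, and it is only an assumption, is that nldecode() rewrites its argument in place, and writing to a string literal is undefined behaviour in C (on Linux it typically raises SIGSEGV because literals sit in read-only memory). The sketch below shows that failure mode being avoided by decoding a writable copy. decode_in_place is a hypothetical stand-in for nldecode, and decoded_reason is a hypothetical helper; neither is taken from the Hobbit source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an in-place decoder such as nldecode():
 * replaces each two-character sequence backslash + 'n' with a real
 * newline, writing the result back into the same buffer.
 * The buffer must be writable, so never pass a string literal. */
void decode_in_place(char *msg)
{
    char *in = msg;
    char *out = msg;

    assert(msg != NULL);
    while (*in != '\0') {
        if (in[0] == '\\' && in[1] == 'n') {
            *out++ = '\n';
            in += 2;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
}

/* Returns a heap-allocated, decoded copy of the disable reason.
 * A missing (NULL) reason is replaced by a default text first.
 * The caller frees the result; NULL is returned if allocation fails. */
char *decoded_reason(const char *raw)
{
    const char *src = (raw != NULL) ? raw : "(No reason given)";
    char *copy = malloc(strlen(src) + 1);

    if (copy == NULL) return NULL;
    strcpy(copy, src);
    decode_in_place(copy);  /* safe: copy is writable heap memory */
    return copy;
}
```

Calling decoded_reason(NULL) yields a writable copy of the default text, so the decoder never touches the literal itself.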