Xymon Mailing List Archive

hobbitd_alert - program crash

4 messages in this thread

Thomas Pedersen · Tue, 08 Nov 2005 20:21:12 +0100
Hi All,

Just configured some more alerts in my hobbit and I have a hobbitd_alert
program crash. I am running routermon and today I have tried to exclude
if_stats from sending an SMS during the night. I have configured
hobbit-alerts.cfg as follows:

PAGE=net-nodes/routers
  IGNORE HOST=$ROUTERS TIME=*:1600:2359 SERVICE=if_stat
  IGNORE HOST=$ROUTERS TIME=*:0000:0800 SERVICE=if_stat
  MAIL admin at localhost
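
To make the intent of the two IGNORE lines concrete, here is a minimal Python sketch of how the two windows would combine. It assumes that TIME=*:HHMM:HHMM means "any day, from the start time to the end time, both inclusive". The names IGNORE_WINDOWS, parse_hhmm and is_ignored are made up for illustration; this is not Hobbit's own matching code.

```python
from datetime import time

# The two windows from the IGNORE rules above, as (start, end) pairs.
# Assumption: "*" means every day and both endpoints are inclusive.
IGNORE_WINDOWS = [("1600", "2359"), ("0000", "0800")]


def parse_hhmm(value: str) -> time:
    """Turn an HHMM string such as '1600' into a datetime.time."""
    return time(hour=int(value[:2]), minute=int(value[2:]))


def is_ignored(now: time, windows=IGNORE_WINDOWS) -> bool:
    """Return True if 'now' falls inside any of the ignore windows."""
    return any(
        parse_hhmm(start) <= now <= parse_hhmm(end) for start, end in windows
    )


if __name__ == "__main__":
    for probe in (time(3, 30), time(12, 0), time(19, 55)):
        state = "ignored" if is_ignored(probe) else "alerts"
        print(probe.strftime("%H:%M"), state)
```

Under that reading, if_stat alerts would only go out between 08:01 and 15:59, which matches the stated goal of keeping the night quiet.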

Gdb gives the following

-bash-2.05b$ gdb bin/hobbitd_alert tmp/core.26618
GNU gdb Red Hat Linux (6.1post-1.20040607.52rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host 
libthread_db library "/lib/tls/libthread_db.so.1".

Core was generated by `hobbitd_alert 
--checkpoint-file=/hobbit/hobbit/server/tmp/alert.chk --checkpoin'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/libpcre.so.0...done.
Loaded symbols for /lib/libpcre.so.0
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x00658cef in raise () from /lib/tls/libc.so.6
(gdb) bt
#0  0x00658cef in raise () from /lib/tls/libc.so.6
#1  0x0065a4f5 in abort () from /lib/tls/libc.so.6
#2  0x08055aee in sigsegv_handler (signum=11) at sig.c:57
#3  <signal handler called>
#4  0x0804da0b in find_repeatinfo (alert=0x9f94fe0, recip=0x9f945d0, 
create=0) at do_alert.c:1054
#5  0x0804ee44 in clear_interval (alert=0x9f94fe0) at do_alert.c:1602
#6  0x0804ab03 in main (argc=167333856, argv=0x0) at hobbitd_alert.c:495
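
When reading a backtrace like this one, frames #0 to #2 are only the crash handler: sigsegv_handler caught signal 11 and called abort(). The frame of interest is the first one after "<signal handler called>". A small Python sketch of picking that frame out of pasted gdb output follows; the helper names join_frames, faulting_frame and describe are hypothetical, and the regular expression only targets the frame layout shown above.

```python
import re

BACKTRACE = """\
#0  0x00658cef in raise () from /lib/tls/libc.so.6
#1  0x0065a4f5 in abort () from /lib/tls/libc.so.6
#2  0x08055aee in sigsegv_handler (signum=11) at sig.c:57
#3  <signal handler called>
#4  0x0804da0b in find_repeatinfo (alert=0x9f94fe0, recip=0x9f945d0,
create=0) at do_alert.c:1054
#5  0x0804ee44 in clear_interval (alert=0x9f94fe0) at do_alert.c:1602
#6  0x0804ab03 in main (argc=167333856, argv=0x0) at hobbitd_alert.c:495
"""

# Matches "... in FUNCTION (ARGS) at FILE:LINE" at the end of a frame.
LOCATION_RE = re.compile(r" in (\w+) \(.*\) at (\S+):(\d+)$")


def join_frames(text):
    """Re-join frames that the mail client wrapped across lines."""
    frames = []
    for line in text.splitlines():
        if line.startswith("#"):
            frames.append(line.strip())
        elif frames and line.strip():
            frames[-1] += " " + line.strip()
    return frames


def faulting_frame(text):
    """Return the first frame after '<signal handler called>', or None."""
    frames = join_frames(text)
    for index, frame in enumerate(frames):
        if "<signal handler called>" in frame and index + 1 < len(frames):
            return frames[index + 1]
    return None


def describe(frame):
    """Return (function, file, line) for a frame, or None if it has no location."""
    match = LOCATION_RE.search(frame or "")
    return match.groups() if match else None


if __name__ == "__main__":
    print(describe(faulting_frame(BACKTRACE)))
```

For the trace above this points at find_repeatinfo in do_alert.c, called from clear_interval.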

page.log gives:
2005-11-08 19:55:55 Worker process died with exit code 134, terminating
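
The exit code 134 is consistent with gdb's "Program terminated with signal 6, Aborted". How the log line computes its number is an assumption here, but both common readings lead to signal 6: the shell convention (128 + signal number) and a raw wait status (signal number plus the 0x80 core-dump flag). A short Python sketch with hypothetical helper names, for Unix-like systems:

```python
import os
import signal


def from_shell_convention(code):
    """Shell-style exit codes above 128 mean 'killed by signal code - 128'."""
    return code - 128 if code > 128 else None


def from_raw_wait_status(status):
    """Interpret the value as a raw wait() status instead (Unix only)."""
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


def signal_name(number):
    """Map a signal number to its name, or 'unknown' if there is none."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return "unknown"


if __name__ == "__main__":
    code = 134
    print("shell convention:", signal_name(from_shell_convention(code)))
    print("raw wait status: ", signal_name(from_raw_wait_status(code)))
```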

hobbitlaunch.log gives:
2005-11-08 19:55:55 Task bbpage terminated, status 1

Any clue as to what caused it?

Regards,  Thomas
Thomas Pedersen · Wed, 09 Nov 2005 08:45:36 +0100
Hi All,

Just configured some more alerts in my hobbit and I have a hobbitd_alert
program crash. I am running routermon and today I have tried to exclude
if_stats from sending an SMS during the night. I have configured
hobbit-alerts.cfg as follows:

PAGE=net-nodes/routers
  IGNORE HOST=$ROUTERS TIME=*:1600:2359 SERVICE=if_stat
  IGNORE HOST=$ROUTERS TIME=*:0000:0800 SERVICE=if_stat
  MAIL admin at localhost

Gdb gives the following

-bash-2.05b$ gdb bin/hobbitd_alert tmp/core.26618
GNU gdb Red Hat Linux (6.1post-1.20040607.52rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host
libthread_db library "/lib/tls/libthread_db.so.1".

Core was generated by `hobbitd_alert
--checkpoint-file=/hobbit/hobbit/server/tmp/alert.chk --checkpoin'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/libpcre.so.0...done.
Loaded symbols for /lib/libpcre.so.0
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x00658cef in raise () from /lib/tls/libc.so.6
(gdb) bt
#0  0x00658cef in raise () from /lib/tls/libc.so.6
#1  0x0065a4f5 in abort () from /lib/tls/libc.so.6
#2  0x08055aee in sigsegv_handler (signum=11) at sig.c:57
#3  <signal handler called>
#4  0x0804da0b in find_repeatinfo (alert=0x9f94fe0, recip=0x9f945d0,
create=0) at do_alert.c:1054
#5  0x0804ee44 in clear_interval (alert=0x9f94fe0) at do_alert.c:1602
#6  0x0804ab03 in main (argc=167333856, argv=0x0) at hobbitd_alert.c:495

page.log gives:
2005-11-08 19:55:55 Worker process died with exit code 134, terminating

hobbitlaunch.log gives:
2005-11-08 19:55:55 Task bbpage terminated, status 1

Any clue as to what caused it?

Regards,  Thomas
Thomas Pedersen · Wed, 09 Nov 2005 14:32:01 +0100
Help!

Doesn't anybody have a clue as to why the alert configuration is causing
my system to crash?

Is the IGNORE statement wrong in any way?

In hope of help, Thomas

Henrik Størner · Wed, 9 Nov 2005 17:05:47 +0100
On Wed, Nov 09, 2005 at 08:45:36AM +0100, Thomas wrote:
> Hi All,
>
> Just configured some more alerts in my hobbit and I have a hobbitd_alert
> program crash. I am running routermon and today I have tried to exclude
> if_stats from sending an SMS during the night. I have configured
> hobbit-alerts.cfg as follows:
>
> PAGE=net-nodes/routers
>  IGNORE HOST=$ROUTERS TIME=*:1600:2359 SERVICE=if_stat
>  IGNORE HOST=$ROUTERS TIME=*:0000:0800 SERVICE=if_stat
>  MAIL admin at localhost

It looks OK. I assume this is with some recent snapshot?

> #1  0x0065a4f5 in abort () from /lib/tls/libc.so.6
> #2  0x08055aee in sigsegv_handler (signum=11) at sig.c:57
> #3  <signal handler called>
> #4  0x0804da0b in find_repeatinfo (alert=0x9f94fe0, recip=0x9f945d0, create=0) at do_alert.c:1054
> #5  0x0804ee44 in clear_interval (alert=0x9f94fe0) at do_alert.c:1602
> #6  0x0804ab03 in main (argc=167333856, argv=0x0) at hobbitd_alert.c:495

Could you send me (directly) the hobbitd_alert binary, the
core file, and your bb-hosts and hobbit-alerts.cfg files?

If it's too big for an email, upload it to ftp://ftp.sslug.dk/incoming/ 
and drop me a mail about it.


Regards,
Henrik