Xymon Mailing List Archive

Core dump when using SCRIPT keyword

8 messages in this thread

list Mike Russo · Wed, 8 Jan 2014 16:13:17 +0000 ·
Hi -
I've got a little problem with Xymon 4.3.12: xymond_alert dumps core when I try to use the SCRIPT keyword in alerts.cfg.  We're trying to create a script containing custom logic that decides whether to send alerts for specific processes. We need to disable process checking for one particular process on certain hosts at a certain time of day, while the other process checks continue. We can't find any way of doing this with the standard arguments, so we figured we'd have to pass the checking along to a custom script.

However, when I try to set up the alert, I get a core dump and the script does not run.  My alerts.cfg statement is:
HOST=%^rqstp.* SERVICE=procs
                SCRIPT /home/bb/RQTimedProcAlerts.sh user-97733e856702@xymon.invalid DURATION>1m REPEAT=10 RECOVERED COLOR=red TIME=W:0801:1830

I have also tried eliminating some of those other keywords, reducing the line to the simple form:
                SCRIPT /home/bb/RQTimedProcAlerts.sh user-97733e856702@xymon.invalid COLOR=red

But I still experience the coredump. Here is a backtrace:

[bb at bongo server]$ gdb bin/xymond_alert core.19031
GNU gdb (GDB) CentOS (7.0.1-45.el5.centos)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/bb/server/bin/xymond_alert...done.
[New Thread 19031]

warning: .dynamic section for "/lib/libpthread.so.0" is not at the expected address

warning: difference appears to be caused by prelink, adjusting expectations

warning: .dynamic section for "/usr/lib/libgssapi_krb5.so.2" is not at the expected address

warning: difference appears to be caused by prelink, adjusting expectations

warning: .dynamic section for "/lib/libdl.so.2" is not at the expected address

warning: difference appears to be caused by prelink, adjusting expectations
Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/librt.so.1
Reading symbols from /lib/libssl.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libssl.so.6
Reading symbols from /lib/libcrypto.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libcrypto.so.6
Reading symbols from /lib/libpcre.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libpcre.so.0
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/lib/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgssapi_krb5.so.2
Reading symbols from /usr/lib/libkrb5.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libkrb5.so.3
Reading symbols from /lib/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libcom_err.so.2
Reading symbols from /usr/lib/libk5crypto.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libk5crypto.so.3
Reading symbols from /lib/libresolv.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libz.so.1
Reading symbols from /usr/lib/libkrb5support.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libkrb5support.so.0
Reading symbols from /lib/libkeyutils.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libkeyutils.so.1
Reading symbols from /lib/libselinux.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libselinux.so.1
Reading symbols from /lib/libsepol.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libsepol.so.1
Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libgcc_s.so.1
Core was generated by `xymond_alert --checkpoint-file=/home/bb/server/tmp/alert.chk --checkpoint-inter'.
Program terminated with signal 6, Aborted.
#0  0x00dba402 in __kernel_vsyscall ()
(gdb) bt
#0  0x00dba402 in __kernel_vsyscall ()
#1  0x00605e30 in raise () from /lib/libc.so.6
#2  0x00607741 in abort () from /lib/libc.so.6
#3  0x0063e8cb in __libc_message () from /lib/libc.so.6
#4  0x00648f11 in _int_realloc () from /lib/libc.so.6
#5  0x0064aea6 in realloc () from /lib/libc.so.6
#6  0x006088e5 in __add_to_environ () from /lib/libc.so.6
#7  0x00608657 in putenv () from /lib/libc.so.6
#8  0x0804eb49 in send_alert (alert=0x9ab9d68, logfd=0x9ab8250) at do_alert.c:627
#9  0x0804b62d in main (argc=Cannot access memory at address 0x4a57
) at xymond_alert.c:890
(gdb)


--
Michael Russo, ReadQ Systems Inc.
1 Whitehall Street, 16th Floor, NY NY 10004
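
For readers following along, here is a minimal sketch of the kind of filter script described in the message above. It is not the poster's actual RQTimedProcAlerts.sh. It assumes that xymond_alert passes the alert details to the script in environment variables such as BBALPHAMSG, BBHOSTNAME, BBSVCNAME, BBCOLORLEVEL and RCPT (check the alerts.cfg manual page for your version), and that each failing process appears in the procs status as a line starting with "&red <name>". The process name nightly_batch and the 01:00 to 03:00 quiet window are hypothetical.

```shell
#!/bin/sh
# Hypothetical alert filter; all names and times below are illustrative.

QUIET_PROC="nightly_batch"   # process to ignore during the quiet window
QUIET_START=0100             # HHMM, inclusive
QUIET_END=0300               # HHMM, exclusive

# in_quiet_window HHMM: succeeds if HHMM lies inside the quiet window.
in_quiet_window() {
    # Prefix "1" so values such as 0059 are always compared as decimal.
    [ "1$1" -ge "1$QUIET_START" ] && [ "1$1" -lt "1$QUIET_END" ]
}

# red_procs MESSAGE: prints the names of processes flagged red, one per line.
# Assumes lines of the form "&red <name> ..." in the status text.
red_procs() {
    printf '%s\n' "$1" | sed -n 's/^&red \([^ ]*\).*/\1/p'
}

# should_alert HHMM MESSAGE: succeeds if an alert should be sent.
should_alert() {
    if in_quiet_window "$1"; then
        # Inside the window, alert only if some other process is red.
        red_procs "$2" | grep -v -x -F "$QUIET_PROC" | grep -q .
    else
        return 0
    fi
}

# Main part: only runs when the (assumed) Xymon variables are present.
if [ -n "${BBALPHAMSG:-}" ]; then
    if should_alert "$(date +%H%M)" "$BBALPHAMSG"; then
        printf '%s\n' "$BBALPHAMSG" |
            mail -s "Xymon: ${BBHOSTNAME:-} ${BBSVCNAME:-} ${BBCOLORLEVEL:-}" "${RCPT:-}"
    fi
fi
```

With a script like this, the rule in alerts.cfg only needs to select the hosts and the procs test; the time-of-day exception lives in the script.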
list Henrik Størner · Wed, 08 Jan 2014 17:49:56 +0100 ·
On 08-01-2014 17:13, Mike Russo wrote:
HOST=%^rqstp.* SERVICE=procs
                 SCRIPT /home/bb/RQTimedProcAlerts.sh user-97733e856702@xymon.invalid DURATION>1m REPEAT=10 RECOVERED COLOR=red TIME=W:0801:1830

The criteria don't go on the action line. Try

HOST=%^rqstp SERVICE=procs COLOR=red TIME=W:0801:1830 DURATION>1m REPEAT=10 RECOVERED
	SCRIPT /home/bb/RQTimedProcAlerts.sh user-97733e856702@xymon.invalid


Regards,
Henrik
list Mike Russo · Wed, 8 Jan 2014 17:11:41 +0000 ·
Really? I have other alerts set up in this way. And in http://xymon.sourceforge.net/xymon/help/xymon-alerts.html it says those keywords are good for "rules and recipients" so I thought they could go either place.

But seconds after I posted this, you announced Xymon 4.3.13, and although I couldn't see anything in the Changes file specifically relating to this, the crash no longer occurs with that version! :) Thank you for looking into this.

--
Michael Russo, ReadQ Systems Inc.
1 Whitehall Street, 16th Floor, NY NY 10004

list Henrik Størner · Wed, 08 Jan 2014 21:57:29 +0100 ·
On 08-01-2014 18:11, Mike Russo wrote:
Really? I have other alerts set up in this way. And in
http://xymon.sourceforge.net/xymon/help/xymon-alerts.html it says
those keywords are good for "rules and recipients" so I thought they
could go either place.
Not sure, really ... I wouldn't write the rules that way myself, but perhaps I made the code robust enough to handle it ok.
But seconds after I posted this, you announced Xymon 4.3.13, and
although I couldn't see anything in the Changes file specifically
relating to this, the crash no longer occurs with that version! :)
Thank you for looking into this.
Honestly, I could not see why it would crash like that. It seems mostly like a problem with putting a new variable into the environment, so maybe it just depends on the amount of data that goes into the message. (Environment size is limited on some platforms). But that is just a guess.


Regards,
Henrik
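
Henrik's guess about environment size can be checked with a quick shell sketch. It compares the current environment with the ARG_MAX limit reported by getconf; the function names env_bytes and arg_max are made up for this example. Note the assumption: ARG_MAX is enforced when a program is exec'd, so this only shows whether a large alert message could plausibly push the environment toward that limit. It does not by itself explain an abort inside realloc().

```shell
#!/bin/sh
# env_bytes: approximate size of the current environment, in bytes.
env_bytes() {
    env | wc -c | tr -d ' '
}

# arg_max: system limit on the combined size of arguments and environment.
# Prints "unknown" if getconf is not available on this system.
arg_max() {
    getconf ARG_MAX 2>/dev/null || echo unknown
}

printf 'environment: %s bytes, ARG_MAX: %s\n' "$(env_bytes)" "$(arg_max)"
```

Running this as the Xymon user, from the same environment that starts xymond_alert, gives a rough idea of how much headroom is left before a status message is added.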
list John Rothlisberger · Tue, 14 Jan 2014 15:55:14 +0000 ·
I am seeing a similar issue after recently upgrading my OS (Ubuntu 12.04.3 LTS) and Xymon to 4.3.13.

All of the alerts in my alerts file use external scripts, and the core dump is generated by "xymond_alert --checkpoint-file=/home/xymon/server/tmp/alert.chk --checkpoint-in"

gdb ../bin/xymond_alert ./core.01142014.0921
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.

This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/xymon/server/bin/xymond_alert...done.
[New LWP 32565]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Core was generated by `xymond_alert --checkpoint-file=/home/xymon/server/tmp/alert.chk --checkpoint-in'.
Program terminated with signal 6, Aborted.
#0  0xb7793424 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7793424 in __kernel_vsyscall ()
#1  0xb75c21df in raise () from /lib/i386-linux-gnu/libc.so.6
#2  0xb75c5825 in abort () from /lib/i386-linux-gnu/libc.so.6
#3  0x08058e51 in sigsegv_handler (signum=11) at sig.c:57
#4  <signal handler called>
#5  0xb76ce290 in ?? () from /lib/i386-linux-gnu/libc.so.6
#6  0x0805a03e in strbuf_addtobuffer (buf=0x8697070, newtext=<optimized out>, newlen=132)
    at /usr/include/i386-linux-gnu/bits/string3.h:52
#7  0x08057c8f in msg_data (
    msg=0x8738d20 "status ########.msgs yellow Tue Jan 14 09:20:43 2014 - System logs NOT ok\n<pre>\n</pre>\n<pre>\n</pre>\n<pre>\n</pre>\n<pre>\n</pre>\n<pre>\n</pre>\n<pre>\n</pre>\n<pre>\n</pre>\n\n&yellow Warnings in <a href=\"/AT"..., stripcr=1) at misc.c:233
#8  0x0804d4cb in message_text (alert=0x8616ff8, recip=<optimized out>) at do_alert.c:267
#9  0x0804e187 in send_alert (alert=0x8616ff8, logfd=0x86121f0) at do_alert.c:524
#10 0x0804b210 in main (argc=3, argv=0xbfbfd1d4) at xymond_alert.c:901
(gdb)

Thanks,
John
Upcoming PTO:
(none)

John Rothlisberger
IT Strategy, Infrastructure & Security - Technology Growth Platform
TGP for Business Process Outsourcing
Accenture
XXX.XXX.XXXX office

This message is for the designated recipient only and may contain privileged, proprietary, or otherwise confidential information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the e-mail by you is prohibited. Where allowed by local law, electronic communications with Accenture and its affiliates, including e-mail and instant messaging (including content), may be scanned by our systems for the purposes of information security and assessment of internal compliance with Accenture policy. .

www.accenture.com
list John Rothlisberger · Fri, 17 Jan 2014 20:30:17 +0000 (UTC) ·
Henrik Størner <henrik at ...> writes:
On 08-01-2014 17:13, Mike Russo wrote:
HOST=%^rqstp.* SERVICE=procs
                 SCRIPT /home/bb/RQTimedProcAlerts.sh rqit at ... DURATION>1m REPEAT=10 RECOVERED COLOR=red TIME=W:0801:1830

The criteria don't go on the action line. Try

HOST=%^rqstp SERVICE=procs COLOR=red TIME=W:0801:1830 DURATION>1m REPEAT=10 RECOVERED
	SCRIPT /home/bb/RQTimedProcAlerts.sh rqit at ...

Regards,
Henrik

Odd... I have been putting criteria on the action line (SCRIPT) for years and have never had a problem.

Now, after upgrading from 4.3.10 to 4.3.13 I am getting core dumps from xymond_alert that I have not been able to isolate.

John
list Henrik Størner · Sun, 19 Jan 2014 22:21:02 +0100 ·
On 17-01-2014 21:30, John Rothlisberger wrote:
Now, after upgrading from 4.3.10 to 4.3.13 I am getting core dumps from
xymond_alert that I have not been able to isolate.
This seems to be related to new code in 4.3.13 that strips <cr> from 
message texts, so that mail programs do not treat alerts as binary 
and send them as attachments.

I think the attached diff against 4.3.13 should fix it.

Regards,
Henrik
Attachments (1)
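
The patch itself is an attachment and is not reproduced in this archive, and the C code in xymond_alert that performs the stripping is not shown in the thread. Purely to illustrate what "stripping <cr>" means for a message with CRLF line endings, here is a shell sketch; the function name strip_cr is made up for this example and is not part of Xymon.

```shell
#!/bin/sh
# strip_cr: remove every carriage return (CR, \r) from standard input,
# leaving the line feeds in place.
strip_cr() {
    tr -d '\r'
}

# Example: a two-line message with CRLF line endings, cleaned to LF only.
printf 'status line\r\ndetail line\r\n' | strip_cr
```

The same effect in C means copying the message while skipping CR bytes, which is where the buffer handling has to be exact.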
list John Rothlisberger · Mon, 20 Jan 2014 15:03:08 +0000 ·
Thanks, I have applied the patch and will let you know how things go.

Thanks,
John
Upcoming PTO:
(none)

John Rothlisberger
IT Strategy, Infrastructure & Security - Technology Growth Platform
TGP for Business Process Outsourcing
Accenture
XXX.XXX.XXXX office
