Xymon Mailing List Archive

Scheduled disable causes crash?

Johan Sjöberg
Tue, 18 Jan 2011 08:21:54 +0100
Message-Id: <user-1b0573d8080c@xymon.invalid>

Hi.

I have not seen this problem since applying the patch. I can't be sure that it's fixed since it didn't happen every time, but it is looking good.

/Johan
-----Original Message-----
From: Henrik Størner [mailto:user-ce4a2c883f75@xymon.invalid]
Sent: 6 December 2010 12:41
To: xymon at xymon.com
Subject: Re: [xymon] Scheduled disable causes crash?

In <user-06299813fc61@xymon.invalid> Johan Sjöberg
<user-74c177c1220d@xymon.invalid> writes:
During the last month, we have had some problems with Xymon when
using scheduled disables (added from the web interface).
The first problem we had was on September 17th, when hobbitd crashed
while/after running the scheduled disable. We got the following error
in hobbit.log:
2010-09-17 05:00:00 Fatal error in select: Bad file descriptor
2010-09-17 05:00:00 Setup complete

There is a bug lurking in the scheduled-task code, but I haven't been
able to quite nail down where it is. I've seen the same problem that
you have a couple of times, where a scheduled "disable" results in
xymond (hobbitd) crashing immediately afterwards.

One potential bug I did catch is fixed with the following patch:

Index: xymond/xymond.c
===================================================================
--- xymond/xymond.c	(revision 6604)
+++ xymond/xymond.c	(working copy)
@@ -3971,7 +3971,7 @@
 	if (msg->doingwhat == RESPONDING) {
 		shutdown(msg->sock, SHUT_RD);
 	}
-	else {
+	else if (msg->sock >= 0) {
 		shutdown(msg->sock, SHUT_RDWR);
 		close(msg->sock);
 		msg->sock = -1;
@@ -5040,6 +5040,8 @@
 					swalk = swalk->next;
 
 					memset(&task, 0, sizeof(task));
+					task.sock = -1;
+					task.doingwhat = NOTALK;
 					inet_aton(runtask->sender, (struct in_addr *) &task.addr.sin_addr.s_addr);
 					task.buf = task.bufp = runtask->command;
 					task.buflen = strlen(runtask->command); task.bufsz = task.buflen+1;


So it would be interesting to see if this helps in your setup. This patch
is against the current beta-3 code, but it applies to version 4.2.3 as
well if you run patch and explicitly tell it which file to patch:

   patch hobbit-4.2.3/hobbitd/hobbitd.c < task.patch


I am not sure if this fixes the problem, though. Because if this is
what causes the crash, then it ought to happen before the log message
that the task ran is written. Unless the bug doesn't crash the system
right away, but only triggers some memory corruption that results in
a later crash ...
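
To illustrate how a descriptor mix-up could surface only later, here is a minimal sketch. It is not taken from xymond: the struct, the function names and the use of a pipe are all made up for illustration, and it assumes (unconfirmed) that the zeroed task struct is the trigger. After memset() the sock field is 0, which is a valid descriptor number, so an unguarded cleanup closes descriptor 0 without any immediate failure. Only a later select() that includes that descriptor fails with EBADF.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-in for the connection struct; only the descriptor
 * field touched by the patch is modelled. */
struct conn {
    int sock;
};

/* Cleanup as before the patch: closes whatever number is in c->sock. */
static void cleanup_unguarded(struct conn *c) {
    shutdown(c->sock, SHUT_RDWR);   /* fails harmlessly on a non-socket */
    close(c->sock);
    c->sock = -1;
}

/* Cleanup as after the patch: skipped when no descriptor was assigned. */
static void cleanup_guarded(struct conn *c) {
    if (c->sock >= 0) {
        shutdown(c->sock, SHUT_RDWR);
        close(c->sock);
        c->sock = -1;
    }
}

/* Zero-timeout select() on fd; returns 0 on success, else the errno. */
static int select_errno(int fd) {
    fd_set rfds;
    struct timeval tv = {0, 0};
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return (select(fd + 1, &rfds, NULL, NULL, &tv) < 0) ? errno : 0;
}

/* Runs one simulated scheduled task. Returns 1 if descriptor 0 is still
 * usable in select() afterwards, 0 if not, -1 if the setup failed. */
int descriptor0_usable_after_task(int patched) {
    int p[2];
    struct conn task;
    int err;

    if (pipe(p) != 0) return -1;
    if (p[0] != 0) {                 /* park the pipe's read end on fd 0 */
        if (dup2(p[0], 0) < 0) return -1;
        close(p[0]);
    }

    memset(&task, 0, sizeof(task));
    assert(task.sock == 0);          /* zeroed struct points at fd 0 */

    if (patched) {
        task.sock = -1;              /* mark "no socket" explicitly */
        cleanup_guarded(&task);
    } else {
        cleanup_unguarded(&task);    /* silently closes fd 0 */
    }

    err = select_errno(0);           /* EBADF once fd 0 has been closed */
    close(p[1]);
    return err == 0;
}
```

In this sketch descriptor0_usable_after_task(0) returns 0 and descriptor0_usable_after_task(1) returns 1. The guard alone does not help when sock is 0, which would explain why the patch both initializes the field to -1 and adds the check.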


Regards,
Henrik