Strange error in alerts.log
list David Boyer
Anybody have any info on this error in ~xymon/server/logs/alert.log?

Whoops ! Failed to send message (select(2) failed)
-> Select failure while sending to Xymon daemon at 192.168.11.100:1984
-> Recipent '192.168.11.100', timeout 15
-> 1st line: 'config hosts.cfg'
Cannot load hosts.cfg from xymond, code 6
Failed to load from xymond, reverting to file-load

Thanks,
Dave
list Jeremy Laidman
On 16 March 2017 at 06:24, David Boyer <user-a6c09f28d9d2@xymon.invalid> wrote:
Anybody have any info on this error in ~xymon/server/logs/alert.log?

Whoops ! Failed to send message (select(2) failed)
-> Select failure while sending to Xymon daemon at 192.168.11.100:1984
-> Recipent '192.168.11.100', timeout 15
-> 1st line: 'config hosts.cfg'
Cannot load hosts.cfg from xymond, code 6
Failed to load from xymond, reverting to file-load

I'm guessing this is displayed when xymond_alert asks xymond to give it the "hosts.cfg" file contents, but the xymond process is not responding at that time.

The fact that you get a timeout after 15 seconds, rather than being refused, suggests that either the xymond daemon was running but wedged, or that a firewall was dropping packets so the socket could not be established in the first place. I'm not a coder, but my guess is that select(2) is called after the socket is up, and so a wedged xymond is more likely.

Why would xymond not respond? Perhaps high CPU load, or memory thrashing?

Do you see these log messages often? If it's only occasionally, are they all about the same time of day?

Can you run this command and see if it gives the hosts.cfg file:

$ /path/to/xymon 192.168.11.100 'config hosts.cfg'

J
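
For anyone who wants to tell the "refused", "could not connect" and "connected but got no answer" cases apart without the xymon client binary, here is a rough Python sketch. It is not how xymond_alert is implemented. The function name probe_xymond and the stage labels are made up for illustration, and it assumes the daemon reads the request until the sender half-closes the connection and then writes its reply, which is my understanding of the protocol rather than something confirmed in this thread.

```python
import socket


def probe_xymond(host, port=1984, message="config hosts.cfg", timeout=15.0):
    """Send one request to a Xymon daemon and report which stage failed.

    Returns a (stage, detail) tuple. The stage labels are hypothetical,
    chosen for this sketch only.
    """
    try:
        # TCP handshake; the same timeout then applies to later send/recv calls.
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return ("refused", "nothing is listening on that port")
    except socket.timeout:
        return ("connect-timeout", "no answer to the handshake (filtered?)")
    except OSError as exc:
        return ("connect-error", str(exc))

    with sock:
        try:
            sock.sendall(message.encode("ascii"))
            # Assumption: a half-close marks the end of the request.
            sock.shutdown(socket.SHUT_WR)
            chunks = []
            while True:
                data = sock.recv(65536)
                if not data:  # peer closed: reply is complete
                    break
                chunks.append(data)
        except socket.timeout:
            return ("response-timeout", "connected, but the daemon did not answer")
        except OSError as exc:
            return ("io-error", str(exc))

    return ("ok", b"".join(chunks).decode("utf-8", "replace"))


if __name__ == "__main__":
    stage, detail = probe_xymond("192.168.11.100")
    print(stage)
    if stage != "ok":
        print(detail)
```

A "response-timeout" result would point at a wedged or overloaded daemon, while "connect-timeout" would point more towards packets being dropped on the way.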
list Jeremy Laidman
David,

Is xymond_alert running on the same server as xymond? If so, perhaps try setting XYMSRV or XYMSERVERS to 127.0.0.1 and see if that helps. I'm wondering if the problem has something to do with your VM.

Do the flap messages in xymond.log mention which test was flapping?

Cheers
Jeremy

On 17 March 2017 at 05:01, David Boyer <user-a6c09f28d9d2@xymon.invalid> wrote:
Jeremy,
Yes, it returns hosts.cfg and the contents of the hosts.d
directory. A little more background: the VM is NAT'd, and the
routable address is what is being queried and what is configured in the
xymonserver.cfg config file. The error surfaced 9 times overnight.
The xymond.log has a dozen or so messages about flapping; xymonnet.log is
empty. The history log just has messages from the restart on the server
yesterday, about not updating an ext test and color unchanged.
Thanks,
Dave
On Wed, Mar 15, 2017 at 3:41 PM, Jeremy Laidman <user-71895fb2e44c@xymon.invalid> wrote: