Two DNS lookups for a server but one fails
list Martin Ward
Hi all,
I have a number of servers that run their own DNS service, each giving
out a single IP address. These DNS servers are monitored in bb-hosts
like this (names and IPs changed to protect the guilty):
1.2.3.4 dns1.server.com # smtp dns=a:smtp.server.com,ns:smtp.server.com
The issue is that this configuration performs two DNS lookups, one for
an NS record for smtp.server.com and one for an A record for
smtp.server.com. When the test runs, either the NS or the A record is
returned, but not both. The one that fails shows the following in the
web interface:
Service dns on mc20.lon.server.com is not OK : Service unavailable
*** DNS lookup of 'a:smtp.server.com' ***
Timeout (channel destroyed)
In this instance it was the A record that failed but in others it is the
NS record. I always get one of the queries back successfully, but not
both.
These were working fine until I upgraded to Xymon 4.2.2 so this looks
like the culprit. Any ideas or suggestions?
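To make the two lookups concrete, here is a minimal sketch (in Python, not Xymon's actual C code) of how a dns= tag like the one above breaks down into separate queries. The function name parse_dns_tag and its return shape are hypothetical; only the comma-separated type:name layout is taken from the bb-hosts entry.

```python
# Hypothetical helper, for illustration only: split a bb-hosts "dns=" tag
# into the individual lookups it asks for. This is not how bbtest-net is
# implemented; it only mirrors the "type:name,type:name" layout shown above.

def parse_dns_tag(tag):
    """Return a list of (record_type, name) pairs from a 'dns=...' tag."""
    prefix = "dns="
    if not tag.startswith(prefix):
        raise ValueError("not a dns= tag: %r" % tag)
    lookups = []
    for item in tag[len(prefix):].split(","):
        record_type, sep, name = item.partition(":")
        if not sep or not name:
            # Assumption: every item in this sketch must carry a type prefix.
            raise ValueError("expected type:name, got %r" % item)
        lookups.append((record_type.upper(), name))
    return lookups


if __name__ == "__main__":
    for record_type, name in parse_dns_tag("dns=a:smtp.server.com,ns:smtp.server.com"):
        print(record_type, name)
```

Each pair corresponds to one query sent to the monitored DNS server, which is why a single dns= tag can produce one success and one timeout.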
|\/|artin
--
Martin W. Ward
TAC Network Systems Team Leader
COLT
Unit 12, Powergate Business Park
Volt Avenue, Park Royal, London
NW10 6PW, United Kingdom
Tel: + 44 (0)20 7863 5218 Internal: 8 441 5218
Fax: + 44 (0)20 7863 5610
Email: user-2d33a6eb6a05@xymon.invalid
www.colt.net
Data | Voice | Managed Services
*************************************************************************************
The message is intended for the named addressee only and may not be disclosed to or used by anyone else, nor may it be copied in any way.
The contents of this message and its attachments are confidential and may also be subject to legal privilege. If you are not the named addressee and/or have received this message in error, please advise us by e-mailing user-61c7f445d564@xymon.invalid and delete the message and any attachments without retaining any copies.
Internet communications are not secure and COLT does not accept responsibility for this message, its contents nor responsibility for any viruses.
No contracts can be created or varied on behalf of COLT Telecommunications, its subsidiaries or affiliates ("COLT") and any other party by email Communications unless expressly agreed in writing with such other party.
Please note that incoming emails will be automatically scanned to eliminate potential viruses and unsolicited promotional emails. For more information refer to www.colt.net or contact us on +44(0)20 7390 3900.
list Henrik Størner
Hi Martin,
On Mon, Jan 05, 2009 at 01:58:56PM -0000, Ward, Martin wrote:
> *** DNS lookup of 'a:smtp.server.com' ***
> Timeout (channel destroyed)
>
> In this instance it was the A record that failed but in others it is
> the NS record. I always get one of the queries back successfully, but
> not both.
>
> These were working fine until I upgraded to Xymon 4.2.2 so this looks
> like the culprit. Any ideas or suggestions?

There was a change made in 4.2.2 - backported from the 4.3.x code - to
fix a bug that could cause the network tests to lock up while doing the
DNS lookups. It is probably that "fix" that causes the problem.

Going over the DNS code again, I think there's some flawed logic in how
it handles the lookups. Could you try the attached version of
xymon-4.2.2/bbnet/dns.c? Just copy it on top of the existing one, then
run "make" and copy the resulting xymon-4.2.2/bbnet/bbtest-net binary
to your ~xymon/server/bin/ directory (save the existing one just in case
this completely breaks stuff).

Let me know if that is better.

Regards,
Henrik
Attachments (1)
list Martin Ward
Hi Henrik,

I compiled that in and installed it but it seems to have messed up all the remote port checks. All my ssh port tests, which are initiated from the server, are now purple, as well as the DNS checks, syslog port checks and others besides.

Rebuilding with the previous version has restored the remote port checks as well as the dual-DNS-check errors.

|\/|artin
list Johan Sjöberg
Hi.

We have been experiencing another DNS check problem since the upgrade to Xymon 4.2.2. Since I upgraded, I sometimes get "Timeout (channel destroyed) Seconds: 4.999" on two DNS servers that are on an offsite location (connected over VPN). The problem started immediately after the update, so I think it is related. This never happened with 4.2.0. Has the timeout been changed in the new version?

Anyhow, I compiled and installed the new dns.c and have not experienced any "purple" issues. Now I will just wait and see if the DNS check alerts will continue to appear.

/Johan
list Johan Sjöberg
Last night, after installing the new bbtest-net, we received an alarm on bbtest for the Xymon server, saying " - Program crashed Fatal signal caught!"

From hobbitlaunch.log: "2009-01-08 05:05:07 Task bbnet terminated by signal 6"

/Johan
list Lars Ebeling
If the program crashed, you should have a core dump. Run gdb on the core dump and post the result here, for Henrik to look at.

--
Regards
Lars Ebeling
http://leopg9.no-ip.org
Hobbithobbyist

"I am not young enough to know everything." -- Oscar Wilde
list Johan Sjöberg
Where should that core dump be located? There is no dump located in the server/bin/ directory :(

/Johan
list Lars Ebeling
Run find:

find / -name "core*" -print

Lars
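For anyone who would rather narrow the search to one directory tree and see the most recent candidates first, the same idea can be sketched in Python. The function names are hypothetical, and the sketch assumes a core file was actually written; whether one is produced at all typically depends on the core size limit (ulimit -c) in effect for the crashing process and on that process's working directory.

```python
import fnmatch
import os


def find_core_files(root, pattern="core*"):
    """Yield paths of non-directory entries under root whose name matches pattern."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)


def newest_first(paths):
    """Sort paths by modification time, most recent first."""
    # lstat is used so that a dangling symlink does not raise an error.
    return sorted(paths, key=lambda p: os.lstat(p).st_mtime, reverse=True)
```

Calling newest_first(find_core_files("/some/dir")) returns the matching paths with the latest one at the front, which makes it easier to spot a dump that matches the crash time in hobbitlaunch.log. Like the find command, the pattern also matches unrelated files whose names merely start with "core".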
list Johan Sjöberg
Hi.

I think I was able to run gdb the correct way... Here is the output:

# gdb ../bin/bbtest-net core
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/i686/cmov/libssl.so.0.9.8...done.
Loaded symbols for /usr/lib/i686/cmov/libssl.so.0.9.8
Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.8...done.
Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.8
Reading symbols from /lib/tls/i686/cmov/libc.so.6...done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/tls/i686/cmov/libdl.so.2...done.
Loaded symbols for /lib/tls/i686/cmov/libdl.so.2
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/tls/i686/cmov/libnss_files.so.2...done.
Loaded symbols for /lib/tls/i686/cmov/libnss_files.so.2
Core was generated by `bbtest-net --report --ping --checkresponse'.
Program terminated with signal 6, Aborted.
#0 0xb7efa410 in ?? ()

/Johan
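The output above stops at frame #0 with no symbol name, so it may help to ask gdb for a full backtrace ("bt") as well. The sketch below is one assumed way to script that in Python; the helper names are hypothetical. -batch and -ex are gdb command-line options in current releases, but this thread does not confirm whether an older build such as 6.4.90 accepts -ex, in which case typing "bt" at the interactive (gdb) prompt does the same job.

```python
import subprocess


def gdb_backtrace_command(binary, core):
    """Build a gdb command line that prints a backtrace and then exits."""
    return ["gdb", "-batch", "-ex", "bt", binary, core]


def run_backtrace(binary, core):
    """Run gdb and return its combined output as text (gdb must be installed)."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

For the session above this would be called as run_backtrace("../bin/bbtest-net", "core"). The backtrace is only as readable as the symbols available, so a binary built with debug information generally gives more useful frames.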
list dOCtoR MADneSs
Hi,

Same issue on my side. I used the newly compiled bbtest-net, built with the corrected dns.c. All is OK, but I got the same issue with the same error message.
list Vernon Everett
This looks remarkably like the error we were experiencing.
What are you running on?
Do you have the case where all the network tests - ping, http, https, ftp etc. go purple?
If so it might be the same.
With Henrik's assistance, we narrowed it down to a problem with the ARES resolver.
I changed our hobbitlaunch.cfg file to include the --no-ares option (see below), and we have never had the issue again.
[bbnet]
ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
NEEDS hobbitd
CMD bbtest-net --report --ping --checkresponse --no-ares
LOGFILE $BBSERVERLOGS/bb-network.log
INTERVAL 5m
(Of course this does come with a caveat. See bbtest-net man page in Xymon docs http://www.xymon.com/hobbit/help/manpages/man1/bbtest-net.1.html )
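If the same change has to be applied to several installations, the edit can be scripted. The following is only a sketch under stated assumptions: the function name add_no_ares is hypothetical, it assumes the [bbnet] section has a single-line CMD entry as in the example above, and it does not handle backslash-continued lines.

```python
def add_no_ares(config_text, section="[bbnet]", flag="--no-ares"):
    """Return config_text with flag appended to the CMD line of the given section."""
    out = []
    in_section = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header starts; remember whether it is the target one.
            in_section = (stripped == section)
        elif in_section and stripped.startswith("CMD ") and flag not in stripped.split():
            # Append the flag once; lines that already carry it are left alone.
            line = line.rstrip() + " " + flag
        out.append(line)
    return "\n".join(out) + "\n"
```

The function works on the file contents as a string and leaves every other section untouched, so the result can be compared with the original before anything is written back. Running it a second time changes nothing.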
YMMV.
Cheers
V
NOTICE: This email and any attachments are confidential. They may contain legally privileged information or copyright material. You must not read, copy, use or disclose them without authorisation. If you are not an intended recipient, please contact us at once by return email and then delete both messages and all attachments.
list dOCtoR MADneSs
Hi,

I had purple on all the net tests (ping, http, https...). Is your solution of using the --no-ares option applied to the recompiled bbtest-net or to the original one? (In other words, does your solution fix the DNS test problem, or the mass-purple one?)
list Vernon Everett
It fixed both problems. It worked with the bbtest-net as it was. --no-ares is a standard option.
list Johan Sjöberg
We did not experience any "purple" problems, only this single crash of the bbtest-net. But I suppose that several consecutive crashes would have caused the tests to go purple.

/Johan
list dOCtoR MADneSs
Using --no-ares does not get rid of the "Timeout (channel destroyed)" errors with the original bbtest-net binary. With the corrected binary it seems to work. I'll wait to see whether the network tests stay alive and keep you informed.
list Martin Ward
Johan, The "purple problems" are a different view of this same error. Having experienced both, I can say that you either get bb-net to crash or you find that all your remote connection tests (where bbtest-net verifies if a remote port is accessible) turn purple. Possibly you may get both at the same time; I haven't noticed that yet. I can confirm that with the new dns.c code, bbtest-net crashes on my Solaris 10 system whether I use the --no-ares option or not. 8-( |\/|artin
-----Original Message-----
From: Johan Sjöberg [mailto:user-74c177c1220d@xymon.invalid]
Sent: 09 January 2009 08:49
To: user-ae9b8668bcde@xymon.invalid
Subject: RE: [hobbit] Two DNS lookups for a server but one fails
We did not experience any "purple" problems, only this single crash of the bbtest-net. But I suppose that several consecutive crashes would have caused the tests to go purple. /Johan
list Martin Ward
OK, I found a core file and have gleaned the following (I removed the symbol load messages):
Core was generated by `bbtest-net --report --ping --checkresponse --no-ares'.
Program terminated with signal 6, Aborted.
#0 0xfec64a27 in _lwp_kill () from /lib/libc.so.1
(gdb) bt
#0 0xfec64a27 in _lwp_kill () from /lib/libc.so.1
#1 0xfec621d4 in thr_kill () from /lib/libc.so.1
#2 0xfec111c7 in raise () from /lib/libc.so.1
#3 0xfebf15d9 in abort () from /lib/libc.so.1
#4 0x0806cbea in sigsegv_handler (signum=11) at sig.c:52
#5 0xfec63def in __sighndlr () from /lib/libc.so.1
#6 0xfec5a292 in call_user_handler () from /lib/libc.so.1
#7 <signal handler called>
#8 0x0806db51 in strbuf_addtobuffer (buf=0x10,
newtext=0x8106960 "@\f\a\bÀX\022\b\b", newlen=135419976) at strfunc.c:100
#9 0x08061783 in dns_detail_callback (arg=0x812ee90, status=16, abuf=0x0,
alen=0) at dns2.c:215
#10 0x08070ba0 in end_squery (squery=0x81258c0, status=134700378, abuf=0x0,
alen=0) at ares_search.c:185
#11 0x08070c6f in search_callback (arg=0x81258c0, status=134700378, abuf=0x0,
alen=0) at ares_search.c:179
#12 0x08072fac in qcallback (arg=0x8106960, status=16, abuf=0x0, alen=0)
at ares_query.c:110
#13 0x080728b4 in ares_destroy (channel=0x8125848) at ares_destroy.c:40
#14 0x08060ff3 in dns_test_server (serverip=0x0,
hostname=0x810781c "a:mx.colt.net,ns:mx.colt.net", banner=0x8106f60)
at dns.c:362
#15 0x08057bf7 in run_nslookup_service (service=0x0) at bbtest-net.c:970
---Type <return> to continue, or q <return> to quit---
#16 0x0805abf9 in main (argc=5, argv=0x804675c) at bbtest-net.c:2218
(gdb)
|\/|artin
list Martin Ward
Well, I have poked and prodded at the code for the last few hours, but to no avail. I can't see why strbuf_addtobuffer() is shown as having rubbish passed to it in the newtext variable when it is being called from dns2.c with a static text string: "Undocumented ARES return code\n". Also, I am unsure why it's even reaching this part of the code when I specified --no-ares on the command line. Any ideas?
|\/|artin
list Henrik Størner
OK, it seems that the bugfix I added in Xymon 4.2.2 had some nasty side-effects. I have started work on a 4.2.3 maintenance tree in Subversion, and there is a new version of the DNS code in it currently. http://hobbitmon.svn.sourceforge.net/viewvc/hobbitmon/branches/4.2.3/ You can try out this code (download it from the link above), but it also involves a change from C-ARES 1.2.1 -> 1.6.0, so you will have to re-run the configure script to perhaps pick up a new runtime library that the new C-ARES requires (librt). I ran it for most of Friday afternoon at work with no obvious bad effects, so I hope it will work better than the current code.
In <user-b8adb59d5e00@xymon.invalid> "Ward, Martin" <user-2d33a6eb6a05@xymon.invalid> writes:
Also I am unsure why it's even reaching this part of the code when I specified --no-ares on the command line.
Xymon still uses ARES to perform the "dns" tests for specific hosts; the standard resolver library does not allow you to specify what DNS server to query. So the --no-ares option only has an effect on the DNS lookups Hobbit performs to determine the IP of the hosts it is testing; it does not affect the specific testing of a DNS server. Regards, Henrik
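As a rough illustration of why a "dns" test needs a resolver that can be pointed at one particular server, here is a minimal Python sketch. It splits a test spec such as a:smtp.server.com,ns:smtp.server.com into separate lookups and builds one raw DNS query packet (RFC 1035 wire format) per lookup. The function names parse_dns_spec and build_query are hypothetical and are not taken from the Xymon source, and treating an entry without a "type:" prefix as an A lookup is an assumption.

```python
import struct

# Record type codes from RFC 1035.
QTYPES = {"a": 1, "ns": 2, "cname": 5, "soa": 6, "ptr": 12, "mx": 15}


def parse_dns_spec(spec):
    """Split 'a:host,ns:host' into [('a', 'host'), ('ns', 'host')].

    Assumption: an entry without a 'type:' prefix means an A lookup.
    """
    lookups = []
    for item in spec.split(","):
        rtype, sep, name = item.strip().partition(":")
        if not sep:
            # No 'type:' prefix, so the whole item is the name.
            rtype, name = "a", rtype
        lookups.append((rtype.lower(), name))
    return lookups


def build_query(name, rtype, query_id=0x1234):
    """Build a raw DNS query packet for one name and record type."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # Question tail: record type and class IN (1).
    return header + qname + struct.pack("!HH", QTYPES[rtype], 1)


if __name__ == "__main__":
    for rtype, name in parse_dns_spec("a:smtp.server.com,ns:smtp.server.com"):
        packet = build_query(name, rtype)
        print(rtype, name, len(packet), "bytes")
```

Each packet could then be sent over UDP to port 53 of the server under test, which is the step a plain hostname lookup through the system resolver does not let you control. The two lookups in one spec are independent queries, so either one can time out on its own.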
list Martin Ward
Thanks for that Henrik. I took a copy of my existing 4.2.2 code, overwrote the bbnet directory with the SVN source and recreated the c-ares subdirectory in the bbnet subdir by untarring c-ares.1.6.0.tar.gz and renaming the c-ares.1.6.0/ subdirectory to c-ares/.
Having done this, the compilation failed as it couldn't find a library for the clock_gettime() function. I found that adding "-lrt" to the "LDAPLIBS" variable in the Makefile in the top-level source directory solved the problem, but this may not be the right place to put it. A little more fiddling was required because my SSL libraries are not located in the standard library locations and I hate having to set up LD_LIBRARY_PATH everywhere, so my LDAPLIBS and SSLLIBS ended up looking like:
SSLLIBS = -L/usr/local/ssl/lib -R/usr/local/ssl/lib -lssl -lcrypto
LDAPLIBS = -L/usr/lib -lldap -lrt
After compiling it successfully I copied bbtest-net into the server/bin directory and restarted xymon. Apart from a timeout issue with a local script, which I believe is a config issue on my part, things look like they are working fine now. The double DNS lookups are working OK and the remote port connection tests are working, so that's good for me. Thanks Henrik!
|\/|artin
list Johan Sjöberg
Is it possible to use the "badTEST" syntax for DNS tests? When I try it, the DNS test stops updating in Xymon. /Johan
list Henrik Størner
In <user-4d6ca4e7279b@xymon.invalid> Johan Sjöberg <user-74c177c1220d@xymon.invalid> writes:
Is it possible to use the "badTEST" syntax for DNS tests? When I try it, the DNS test stops updating in Xymon.
My immediate response would be "yes", but I haven't checked the code. Should work, though. What does your bb-hosts entry look like? Regards, Henrik
list Johan Sjöberg
It looked like this during the tests:
10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 devmon:model(compaq;server)
With this entry, the dns test stopped updating. /Johan
list Henrik Størner
In <user-ec5476262788@xymon.invalid> Johan Sjöberg <user-74c177c1220d@xymon.invalid> writes:
It looked like this during the tests: 10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 devmon:model(compaq;server)
Ah I see. The "baddns" only tells how to handle failures of the DNS check. You still need the "dns" to enable DNS checking at all! So your entry should have been
10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 dns devmon:model(compaq;server)
Regards, Henrik
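The rule above (a badTEST tag only tunes failure handling, and the TEST tag itself must still be present) can be checked mechanically. The following Python sketch is only an illustration: missing_base_tests is a hypothetical helper, not part of Xymon, and it uses a deliberately simplified view of bb-hosts syntax in which tags are the whitespace-separated words after the "#" and a test counts as enabled if a tag is exactly TEST or starts with "TEST=".

```python
def missing_base_tests(line):
    """Return test names that have a badTEST tag but no TEST tag.

    Simplified sketch: tags are the whitespace-separated words after '#',
    and a test counts as enabled if a tag is 'TEST' or starts with 'TEST='.
    """
    _, _, tagtext = line.partition("#")
    tags = tagtext.split()
    missing = []
    for tag in tags:
        if tag.startswith("bad") and ":" in tag:
            # 'baddns:1:2:4' -> test name 'dns'
            test = tag[3:].split(":", 1)[0]
            enabled = any(t == test or t.startswith(test + "=") for t in tags)
            if not enabled and test not in missing:
                missing.append(test)
    return missing


if __name__ == "__main__":
    broken = "10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 devmon:model(compaq;server)"
    fixed = "10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 dns devmon:model(compaq;server)"
    print(missing_base_tests(broken))
    print(missing_base_tests(fixed))
```

Run against the two entries from this thread, the first line reports the dns test as missing and the second reports nothing. Tests written in other forms (for example URL-style http checks) are outside what this sketch recognises.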
list Johan Sjöberg
Ah, thanks, I will try that.
/Johan