Xymon Mailing List Archive

Xymond and xymonnet quirks

16 messages in this thread

list Ben Poppy · Wed, 17 Aug 2011 23:41:41 -0500 ·
After getting the latest Xymon 4.3.4 finally working on CentOS 6 64-bit, I'm seeing a few quirks that I'd appreciate some input on.

First, the Xymonnet test is alerting yellow because the time spent is too long (over 300 seconds). The culprit is the "DNS tests executed" step, which alone takes over 340 seconds. Can anyone shed some light on this? I know I saw some other threads about adding --no-ares, but I wanted to see whether I might be missing something else before I change how the tests work.

Next, under Xymond, I'm getting a ton of multiple-status reports for CONN tests. I can't find a clear explanation in the archives of what would cause that, other than having both IPs listed when you have two display servers.

Both servers reference each other as a client and in xymonserver.cfg.

Thanks for any help,
-Ben

The contents of this message may contain private, protected and/or privileged information.  If you received this message in error, you should destroy the e-mail message and any attachments or copies, and you are prohibited from retaining, distributing, disclosing or using any information contained within.  Please contact the sender and advise of the erroneous delivery by return e-mail or telephone.  Thank you for your cooperation.
list Henrik Størner · Thu, 18 Aug 2011 07:40:00 +0200 ·
quoted from Ben Poppy
On 18-08-2011 06:41, Poppy, Ben wrote:
After getting the latest Xymon 4.3.4 finally working on CentOS 6 64-bit,
I'm seeing a few quirks that I'd appreciate some input on.

First, under Xymonnet tests, it's alerting Yellow because the time spent
is too long (over 300). The culprit is DNS tests executed taking over
340 itself.. Can anyone shed some light on this?
Is it "DNS lookups" or "DNS tests" ?

I've had another report that there is a problem with timeout handling in the DNS *tests* (the explicit tests to see if a dns server is working), which would show up in the latter line, whereas the ordinary DNS *lookups* (those performed to see what IP we should use to probe all the hosts in hosts.cfg) work OK.
quoted from Ben Poppy
Next, under Xymond, I'm getting a ton of multiple statuses reports for
CONN tests. I can't seem to find a clear archive of what would cause
that other than having both IP's listed when you have 2 display servers..

Both servers reference each other as a client and in xymonserver.cfg..
No wonder you get that warning, then. With both servers listing each other in xymonserver.cfg, the network tests you run on each server will send their results to both of your servers, so server A will be getting all of the "conn" status reports both from server A and from server B. Hence the warning: there are two systems reporting the same data.


Regards,
Henrik
list Ben Poppy · Thu, 18 Aug 2011 14:48:19 -0500 ·
Yes, you are correct, DNS Tests:

TIME SPENT
Event                                           Start time          Duration
xymonnet startup                              56982.063059                 -
Service definitions loaded                    56982.063799          0.000739
Tests loaded                                  56982.112916          0.049116
DNS lookups completed                         56987.113241          5.000324
Test engine setup completed                   56987.121448          0.008207
TCP tests completed                           56999.227356         12.105907
PING test completed (1088 hosts)              57034.610220         35.382864
PING test results sent                        57034.630133          0.019912
Test result collection completed              57034.630481          0.000347
LDAP test engine setup completed              57034.630482          0.000000
LDAP tests executed                           57034.630482          0.000000
LDAP tests result collection completed        57034.630482          0.000000
DNS tests executed                            57378.220329        343.589846
Test results transmitted                      57378.266216          0.045887
xymonnet completed                            57378.267156          0.000939
TIME TOTAL                                                        396.204096

For xymonserver.cfg, what is the proper way to configure things when you have two Xymon servers (one primary and one instant backup server that gets the same data from all the reporting clients)? Should I change it from multiple to single in xymonserver.cfg, but still have both listed in xymonclient.cfg?

Thanks,
-Ben
list Henrik Størner · Thu, 18 Aug 2011 21:54:21 +0200 ·
For the xymonserver.cfg, what is the proper way to configure when you have 2 xymon servers (1 primary and 1 instant backup server that gets the same data from all the reporting clients)? Should I change it from multiple to single in the xymonserver.cfg? but still have both listed in the xymonclient.cfg?
Spot on - that is how it should be done. A Xymon server really should 
only be reporting to itself, not to other Xymon servers.


Regards,
Henrik
list Ben Poppy · Thu, 18 Aug 2011 14:58:08 -0500 ·
Will do. Any ideas on the DNS Tests item?
list Henrik Størner · Thu, 18 Aug 2011 21:59:32 +0200 ·
On 18-08-2011 21:58, Poppy, Ben wrote:
Will do. Any ideas on the DNS Tests item?
I'm looking into it. Not sure yet if it is caused by Xymon, or by the 
upgrade of the C-ARES library (which is used for all of the DNS stuff).


Regards,
Henrik
list Bruce White · Thu, 18 Aug 2011 21:51:14 -0500 ·
I have written scripts that run on the second Xymon server, check the status of tests on the first Xymon server, and report the results on the second. That way I have visibility into the status of both Xymon servers on a single Xymon server.

   ....Bruce

    
 
 Bruce White
 Senior Enterprise Systems Engineer | Phone: X-XXX-XXX-XXXX | Fax: XXX-XXX-XXXX | user-58f975e8bf9d@xymon.invalid | http://www.fellowes.com/
 
 
 
Disclaimer: The information contained in this message may be privileged and confidential and protected from disclosure. If the reader of this message is not the intended recipient or an employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by replying to the message and deleting it from your computer. Thank you. Fellowes, Inc.
list Bruce White · Thu, 18 Aug 2011 22:00:02 -0500 ·
Henrik,

This is not new.  Here is data from an April, 2010 run on a Xymon 4.3.0-0.beta2 server.  The issue stayed that way for over an hour and with no changes to DNS or Xymon, it just went away on its own.

TIME SPENT
Event                                            Starttime          Duration
bbtest-net startup                          6140121.488751                 -
Service definitions loaded                  6140121.489712          0.000960
Tests loaded                                6140121.519901          0.030188
DNS lookups completed                       6140131.514514          9.994613
Test engine setup completed                 6140131.519790          0.005275
TCP tests completed                         6140131.930649          0.410859
PING test completed (535 hosts)             6140161.657584         29.726934
PING test results sent                      6140161.659809          0.002224
Test result collection completed            6140161.660014          0.000205
LDAP test engine setup completed            6140161.660015          0.000000
LDAP tests executed                         6140161.660016          0.000001
LDAP tests result collection completed      6140161.660017          0.000000
DNS tests executed                          6140611.940605        450.280588
Test results transmitted                    6140611.948177          0.007571
bbtest-net completed                        6140611.949255          0.001078
TIME TOTAL                                                        490.460504

Hope that helps,
Bruce


 
list Henrik Størner · Fri, 19 Aug 2011 13:36:51 +0200 ·
quoted from Henrik Størner
On Thu, 18 Aug 2011 21:59:32 +0200, Henrik Størner <user-ce4a2c883f75@xymon.invalid> wrote:
On 18-08-2011 21:58, Poppy, Ben wrote:
Will do. Any ideas on the DNS Tests item?
I'm looking into it. Not sure yet if it is caused by Xymon, or by the 
upgrade of the C-ARES library (which is used for all of the DNS stuff).
Think I got it now. Would you mind trying out this patch for me? To
apply:

  cd xymon-4.3.4
  patch -p0 < /tmp/dns-timeout.patch
  make
  cp -p ~xymon/server/bin/xymonnet ~xymon/server/bin/xymonnet.orig
  cp xymonnet/xymonnet ~xymon/server/bin/

This should enforce the normal timeout-setting for DNS tests also.


Regards,
Henrik
list Ben Poppy · Fri, 19 Aug 2011 14:04:54 -0500 ·
I got this error:

Error output:
Odd ... pending_dns_count=704 after a queue run

The total time for DNS tests has dropped from over 300 seconds to ~80 now.
list Ben Poppy · Fri, 19 Aug 2011 14:15:30 -0500 ·
I'm also seeing all sorts of other errors in the CONN and HTTP tests.

I'm reverting to the original xymonnet.
list Ben Poppy · Thu, 25 Aug 2011 20:38:59 -0500 ·
Any progress on this issue by chance? Or is there any way to raise the time limit at which it alerts? I tried adding the following options, one at a time, to the xymonnet CMD line:

--no-ares
--timeout=500
--timelimit=500
--ping-tasks=10

None of them seemed to have any effect (xymonnet also rejected --timelimit as an invalid option, even though it is listed in the man page).

I'm OK with the tests taking a while; it doesn't seem to cause us any problems. I just want that yellow alert off the nongreen page so our Ops staff stops bugging me about it :)

Thanks
list Ben Poppy · Tue, 8 Nov 2011 16:55:14 -0600 ·
Any progress on this by chance?
list Steve Holmes · Tue, 1 May 2012 15:25:13 -0400 ·
We just ran into this same problem and removing the 'dns' tag from one DNS
server (non-production) which wasn't working reduced the time to do the DNS
tests from over 300 seconds to 22. This was after already tuning the
options on the xymonnet call.

Also, WRT the unknown option --timelimit, we think the cause is in
xymonnet.c (4.3.7), starting at line 2011:

    else if (strcmp(argv[argi], "--timelimit=") == 0) {
      char *p = strchr(argv[argi], '=');
      p++; runtimewarn = atol(p);
    }
    else if (strcmp(argv[argi], "--huge=") == 0) {
      char *p = strchr(argv[argi], '=');
      p++; warnbytesread = atoi(p);
    }

Both of the strcmp calls should be argnmatch: strcmp only matches when the
argument is exactly "--timelimit=", so "--timelimit=500" is never
recognized. The second one is for a different option, of course, but it
would probably fail the same way without this fix. Note, I haven't tested
this, but I'm pretty confident that it is right.
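
To illustrate the point about strcmp, here is a minimal sketch of why an exact comparison rejects an option that carries a value while a prefix comparison accepts it. The helper names prefix_match and parse_timelimit are hypothetical stand-ins written for this example; they are not taken from the Xymon source, and the real argnmatch may be defined differently.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a prefix test such as argnmatch():
 * returns 1 when arg begins with opt, 0 otherwise. */
static int prefix_match(const char *arg, const char *opt)
{
    return strncmp(arg, opt, strlen(opt)) == 0;
}

/* Hypothetical helper: returns the number following "--timelimit=",
 * or -1 when arg is some other option. */
static long parse_timelimit(const char *arg)
{
    if (prefix_match(arg, "--timelimit=")) {
        const char *p = strchr(arg, '=');
        return atol(p + 1);   /* skip past the '=' */
    }
    return -1;
}
```

With the argument "--timelimit=500", strcmp(arg, "--timelimit=") is non-zero because the strings differ after the '=', so the original branch is never taken, whereas prefix_match succeeds and parse_timelimit returns 500.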

Steve


-- 

If they give you ruled paper, write the other way. -Juan Ramon Jimenez,
poet, Nobel Prize in literature (1881-1958)

I prayed for freedom for twenty years, but received no answer until I
prayed with my legs. -Frederick Douglass, Former slave, abolitionist,
editor, and orator (1817-1895)
list Henrik Størner · Wed, 02 May 2012 07:40:22 +0200 ·
quoted from Steve Holmes
On 01-05-2012 21:25, Steve Holmes wrote:
We just ran into this same problem and removing the 'dns' tag from one
DNS server (non-production) which wasn't working reduced the time to do
the DNS tests from over 300 seconds to 22. This was after already tuning
the options on the xymonnet call.
I've learned a lot more about the DNS timeout handling in C-ARES after this thread started back in August.

The DNS timeout settings in 4.3.7 really are broken, in that the DNS library doesn't work the way I thought it did when I wrote the code. Instead of being a timeout on the total DNS lookup, the timeout setting is the timeout of the initial DNS query, which will then be retried a number of times (4 is the default, I think) with exponentially higher timeouts. The net effect is that the default timeout settings in Xymon result in DNS queries that take about 30 minutes to time out.

The attached patch against 4.3.7 changes the timeout settings to two fixed values, resulting in a timeout for DNS operations of approximately 23 seconds.
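
As a rough illustration of how a per-query timeout with retries adds up, the sketch below sums the waits for one name server, assuming the timeout doubles on every retry. The doubling factor, the function name total_dns_wait and the sample values are assumptions made for this example. It does not model multiple name servers or the actual Xymon defaults, so it is not meant to reproduce the 30-minute or 23-second figures.

```c
/* Worst-case wait, in seconds, for a query that is tried `tries` times
 * against a single server, assuming (for illustration only) that the
 * timeout starts at `initial_timeout` and doubles on each retry. */
static long total_dns_wait(long initial_timeout, int tries)
{
    long total = 0;
    long timeout = initial_timeout;
    int i;

    for (i = 0; i < tries; i++) {
        total += timeout;   /* wait out the current attempt */
        timeout *= 2;       /* the next attempt waits twice as long */
    }
    return total;
}
```

For example, total_dns_wait(5, 4) adds 5 + 10 + 20 + 40 and returns 75, so a setting that looks like a 5-second timeout can hold up a single test for well over a minute.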
quoted from Steve Holmes
Also, WRT the unknown option --timelimit, we think that in (4.3.7)
xymonnet.c starting at line 2011:

     else if (strcmp(argv[argi], "--timelimit=") == 0) {
       char *p = strchr(argv[argi], '=');
       p++; runtimewarn = atol(p);
     }
     else if (strcmp(argv[argi], "--huge=") == 0) {
       char *p = strchr(argv[argi], '=');
       p++; warnbytesread = atoi(p);
     }

Both of the strcmp calls should be argnmatch. The second one is for a
different option, of course, but it probably would get the same error
without this fix. Note, I haven't tested this, but I'm pretty confident
that it is right.
You are right, of course.


Regards,
Henrik
Attachments (1)
list Ben Poppy · Wed, 2 May 2012 05:48:57 +0000 ·
I believe this is the same patch you provided to me a few weeks ago. I am extremely happy to report that we've had no purple storms since then.