Xymon Mailing List Archive

Runtime longer than time limit (300)

4 messages in this thread

list Chris Wopat · Tue, 01 Sep 2009 08:59:03 -0500 ·
Unsure if this is a bug or not. I noticed that for a few days I've been getting "Runtime 483 longer than time limit (300)" and the bbtest column shows:

TIME SPENT
<snip>
DNS tests executed	450.074868
<snip>
TIME TOTAL		482.741716


I was able to successfully debug this by adding "--debug" to the [bbnet] CMD in hobbitlaunch.cfg. This revealed:


2009-09-01 08:26:01 ares_search: tlookup='dns.server.com', class=1, type=1
2009-09-01 08:26:01 Processing 0 DNS lookups with ARES
2009-09-01 08:33:31 Finished ARES queue after loop 20

Note that it took over 7 minutes to complete this. 'dns.server.com' is a server that we took offline on the day the issue began. We didn't remove it from bb-hosts and instead marked it blue.
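
As a rough cross-check on the timings, the gap between the two ARES debug lines can be computed directly from their timestamps. This is only a minimal Python sketch: the helper name `elapsed_seconds` is made up for illustration, and it assumes every debug line starts with a 19-character `YYYY-MM-DD HH:MM:SS` timestamp, as in the excerpt above.

```python
from datetime import datetime

# Timestamp layout assumed from the --debug excerpt above.
FMT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start_line: str, end_line: str) -> float:
    """Return the seconds between the leading timestamps of two log lines."""
    start = datetime.strptime(start_line[:19], FMT)
    end = datetime.strptime(end_line[:19], FMT)
    return (end - start).total_seconds()


start = "2009-09-01 08:26:01 Processing 0 DNS lookups with ARES"
end = "2009-09-01 08:33:31 Finished ARES queue after loop 20"
print(elapsed_seconds(start, end))  # 450.0
```

The 450 seconds between the two lines matches the 450.07 seconds reported for "DNS tests executed", so essentially the whole DNS phase was spent waiting on that one ARES queue.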

The fix was to comment it out from bb-hosts. It says "loop 20" above so I was searching for config options that listed 20 as a value but didn't come up with anything.

The issue is resolved on my end but I'd like to find out if this is a bug or expected behavior. It seems that a single DNS check on a host that's down shouldn't delay things by 7 minutes.

I'm running Xymon v4.2.3 on FreeBSD 7.2-RELEASE.

Thanks,
Chris
list Josh Luthman · Tue, 1 Sep 2009 10:21:40 -0400 ·
Even when a host is blue, Xymon still runs the tests. If the server is
offline, the DNS tests will time out (as will the other tests, obviously).

No bug, expected behavior.

list Chris Wopat · Tue, 01 Sep 2009 10:34:17 -0500 ·
quoted from Josh Luthman
Josh Luthman wrote:
Even when blue Xymon does the tests.  If the server is offline it will
time out on the DNS tests (as well as other tests, obviously).

No bug, expected behavior.

Taking 7 minutes to time out still seems wrong. I would also normally expect that if a server is unreachable via ICMP, no further tests are run against it.

From what I can tell, the DNS check is run by [bbnet], where I have the default timeout of 15:

[bbnet]
         ENVFILE /usr/local/www/hobbit/server/etc/hobbitserver.cfg
         NEEDS hobbitd
         CMD bbtest-net --report --ping --checkresponse --timeout=15
         LOGFILE $BBSERVERLOGS/bb-network.log
         INTERVAL 5m

`bbtest-net --help` lists a --dns-timeout option, but it says that defaults to 30, so something still seems wrong.

--Chris
list David Baldwin · Wed, 2 Sep 2009 10:10:05 +1000 ·
I've also tuned my DNS tests. The [bbnet] section in hobbitlaunch.cfg is
the place to do it.
See 'man bbtest-net' for options.
I've upped the general timeout because I have a slow web server. I might
shift known ones to urlplus because I can do custom timeouts on each server.
My settings are:
[bbnet]
        ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
        NEEDS hobbitd
        CMD bbtest-net --report --ping --checkresponse --dns-timeout=15 --timeout=20 --concurrency=512
        LOGFILE $BBSERVERLOGS/bb-network.log
        INTERVAL 5m

Check the 'bbtest' column for your hobbit server for stats on bbtest-net
runtimes - there's an RRD graph in there too.
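
For a quick look at which phase dominates a run, the TIME SPENT lines from the 'bbtest' column can be pasted into a small script. The sketch below is only an illustration in Python: `parse_time_spent` is a hypothetical helper, not part of Xymon, and it assumes each line is a label followed by whitespace and a number of seconds, as in the excerpt at the top of this thread.

```python
def parse_time_spent(report: str) -> dict:
    """Map each timing label to its seconds, skipping lines without a number."""
    timings = {}
    for line in report.splitlines():
        parts = line.rsplit(None, 1)  # split off the last whitespace-separated token
        if len(parts) != 2:
            continue  # blank lines and markers such as "<snip>"
        label, value = parts
        try:
            timings[label.strip()] = float(value)
        except ValueError:
            continue  # e.g. the "TIME SPENT" heading
    return timings


# Sample copied from the first message in this thread.
sample = "TIME SPENT\n<snip>\nDNS tests executed\t450.074868\n<snip>\nTIME TOTAL\t\t482.741716\n"

timings = parse_time_spent(sample)
total = timings.pop("TIME TOTAL")
worst = max(timings, key=timings.get)
share = timings[worst] / total
print(f"{worst}: {timings[worst]:.1f}s of {total:.1f}s ({share:.0%})")
```

With the figures from the original report, the DNS phase accounts for nearly all of the runtime, which is why tuning --dns-timeout is the first thing to try.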

David.
