Xymon Mailing List Archive

Debugging help: bbtest-net gets http test timing wrong

list Shane Skoglund
Wed, 18 Jun 2008 09:02:35 -0500
Message-Id: <user-15975416adb3@xymon.invalid>

Did you rebuild the Hobbit binaries on a 64-bit machine?  Or did you install
all the 32-bit compat libs?


On Tue, Jun 17, 2008 at 6:00 PM, Alan Sparks <user-8f2174fd8b66@xymon.invalid>
wrote:
After some Googling, I have added "AcceptFilter http none" directives to
the Apache 2.2 servers, which hasn't really helped anything...

Perhaps I should ask:  Can anyone verify Hobbit works correctly on a 64-bit
system?  Not should, but does, on a CentOS 4 or RHEL 4 x86_64 install?

I see a lot of debugging trace stuff (dbgprint calls) in the contest and
httptest code.  Can anyone tell me how to enable it to trace what Hobbit is
doing?

Am really at a loss.  This can't be rocket science to get it to probe HTTP
correctly.  But a week later, I still cannot get it to match any other
monitoring tool's results.
-Alan
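
One way to compare against other tools is to time each phase of an HTTP probe separately, so a slow result can be pinned on name resolution, the TCP connect, or the server's response. The following is a minimal Python sketch, not Hobbit's own code; the function name time_http_phases and its parameters are hypothetical and chosen only for illustration.

```python
import socket
import time


def time_http_phases(host, port=80, path="/", timeout=10.0):
    """Time name resolution, TCP connect, and first response byte separately.

    Hypothetical helper for illustration; it is not part of Hobbit.
    """
    t0 = time.perf_counter()
    # Resolve the host; take the first address the resolver returns.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    t1 = time.perf_counter()

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        # A retransmitted SYN would show up as extra time in this phase.
        sock.connect(addr)
        t2 = time.perf_counter()

        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        first = sock.recv(1)  # block until the first response byte arrives
        t3 = time.perf_counter()
    finally:
        sock.close()

    return {
        "resolve": t1 - t0,
        "connect": t2 - t1,
        "first_byte": t3 - t2,
        "got_data": bool(first),
    }

# Example call (the hostname is a placeholder):
#   print(time_http_phases("target.example", 80))
```

If the "connect" figure carries most of the elapsed time while "resolve" and "first_byte" stay small, the delay is happening before the web server ever sees the request.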

Alan Sparks wrote:
tcpdumps show a couple of interesting points.

1) There are definitely no DNS lookups occurring as a consequence of the
Hobbit probes.  No port 53 traffic out.

2) The packets from the Hobbit server, and the incoming packets to the
Apache server, sometimes look like:

15:20:01.160095 IP (tos 0x0, ttl  62, id 31129, offset 0, flags [DF],
proto 6, length: 60) hobbit.45116 > target.http: S [tcp sum ok]
265769416:265769416(0) win 17520 <mss 8760,sackOK,timestamp 143665233
0,nop,wscale 2>

15:20:04.159715 IP (tos 0x0, ttl  62, id 31131, offset 0, flags [DF],
proto 6, length: 60) hobbit.45116 > target.http: S [tcp sum ok]
265769416:265769416(0) win 17520 <mss 8760,sackOK,timestamp 143668233
0,nop,wscale 2>

15:20:04.160223 IP (tos 0x0, ttl  62, id 31133, offset 0, flags [DF],
proto 6, length: 40) hobbit.45116 > target.http: . [tcp sum ok]
265769417:265769417(0) ack 1051782089 win 17520

So that accounts for three seconds... it appears there are 2 SYN packets,
but the first isn't getting processed and there's a 3-second delay to the
next SYN (which gets ACKed).  I don't know why this happens only with the
Hobbit connections... and I don't know why the first SYN seems to be getting
ignored.  Server is not at all busy.
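
The three-second figure can be checked directly from the capture timestamps. The short Python sketch below computes the gaps between the three packets shown above; the helper name gaps_in_seconds is made up for this example, and it assumes all timestamps fall on the same day (no midnight wraparound).

```python
from datetime import datetime

# Timestamps copied from the three packets in the capture above.
timestamps = ["15:20:01.160095", "15:20:04.159715", "15:20:04.160223"]


def gaps_in_seconds(stamps):
    """Return the elapsed seconds between each consecutive pair of timestamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


for gap in gaps_in_seconds(timestamps):
    print(f"{gap:.6f} s")
# prints:
# 2.999620 s
# 0.000508 s
```

A gap of almost exactly 3 seconds between two identical SYNs is consistent with the initial TCP retransmission timeout that many stacks of this era used, which would suggest the first SYN was dropped or ignored somewhere rather than answered slowly.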

-Alan

Tim McCloskey wrote:
I get that wget/curl always work.  Not sure what resolver settings may be
implemented differently for hobbit.

Still thinking this may be unrelated to hobbit (even though wget/curl
work fine for you).  We have many Apache boxes spanning multiple networks
running httpd versions 1.3, 2.0 and 2.2 that Hobbit (4.2 with allinone patch)
likes just fine and reports accurate times (Seconds: 0.nn).  We also have
fairly proper forward and reverse DNS records for the systems involved.

I can't imagine hobbit parsing the wrong response times, but if that is
the case I wonder what external libraries are used (not hobbit provided
libs, as ours parse fine and are likely the same as yours).

Anyway, good luck with the tcpdump.

Regards,

Tim


Alan Sparks wrote:
UseCanonicalName is off, and HostnameLookups is off, on every server,
regardless of version.
-Alan

Tim McCloskey wrote:
What do you have for
UseCanonicalName
in the apache 2.0 boxes?