Xymon Mailing List Archive

Conn time discrepancy

3 messages in this thread

list Bill Hart · Wed, 7 Oct 2009 14:40:30 -0500 ·
I've searched the archives and didn't find anything, so I'm hoping someone
here has some insight.

Occasionally we'll notice some drastic change in the graphing on Xymon for
the conn times, and when I bring these up with the networking folks they
basically dismiss it because the labels on the side of the scale are
"wrong".  What I'm seeing is that the Y axis is labeled "Seconds"
with an "m" after the scale values, and it reports around 40-80 ms for ping
times. This isn't what we see from the box itself, though, when we execute
the same hobbitping:

From Xymon page :
<host ip> is alive (44 ms)

From command line on Xymon :

<host ip> is alive (0.13 ms)

What am I missing, and why the discrepancy?

Bill Hart


list Alan Sparks · Wed, 07 Oct 2009 13:49:40 -0600 ·
This is a known issue with the way hobbitping works, compared to fping. 
If you want realistic ping times, you need to use fping.  You'll need to
install fping, and change the "FPING" entry in hobbitserver.cfg.  Also,
be sure the xymon user can run it, since fping usually installs setuid:
either set up a sudo mechanism to call it, or change the group on fping to
the xymon server's group and make sure it's setuid-root and group executable.
-Alan
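
As a rough illustration of the permission requirements described above, the following Python sketch checks whether a binary is owned by root, has the setuid bit set, and is executable by a given group. The function names and the path in the usage comment are hypothetical and are not part of Xymon or fping; where fping lives and which group the Xymon server runs under depend on your installation.

```python
import os
import stat


def fping_runnable_by_group(mode: int, owner_uid: int,
                            file_gid: int, xymon_gid: int) -> bool:
    """Return True if a file with these attributes is setuid-root and
    executable by members of the group xymon_gid."""
    is_setuid = bool(mode & stat.S_ISUID)      # setuid bit set
    owned_by_root = owner_uid == 0             # setuid only helps if root owns it
    group_exec = bool(mode & stat.S_IXGRP)     # group may execute
    same_group = file_gid == xymon_gid         # file belongs to the Xymon group
    return is_setuid and owned_by_root and group_exec and same_group


def check_binary(path: str, xymon_gid: int) -> bool:
    """Apply the check above to a real file on disk."""
    st = os.stat(path)
    return fping_runnable_by_group(st.st_mode, st.st_uid, st.st_gid, xymon_gid)


# Hypothetical usage; the path and group id are assumptions:
#   check_binary("/usr/sbin/fping", xymon_gid=1001)
```

This only covers the "change the group" approach; a sudo-based setup would need to be checked differently.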
list Henrik Størner · Wed, 7 Oct 2009 20:24:33 +0000 (UTC) ·
quoted from Alan Sparks
In <user-75d9337cc2fd@xymon.invalid> Alan Sparks <user-8f2174fd8b66@xymon.invalid> writes:
> user-079de6b18352@xymon.invalid wrote:
>> From Xymon page :
>> <host ip> is alive (44 ms)
>>
>> From command line on Xymon :
>>
>> <host ip> is alive (0.13 ms)
> This is a known issue with the way hobbitping works, compared to fping.
> If you want realistic ping times, you need to use fping.

I mostly agree.

One reason for this is that Xymon's ping function is meant to test whether
the host is on the network; it isn't (really) meant to test how fast the
network connection to the host is. So Xymon runs loads of ping tests in
parallel, and can easily get a much higher round-trip time than a ping
of a single host running on an otherwise idle system. Of course,
Xymon will notice whether it takes 10 ms or 20 seconds to ping a host, but
whether it is 2 or 50 milliseconds really should not be cause for
concern. (E.g. if the ARP entry for the IP address is in the ARP cache,
the ping will be some milliseconds faster than when it isn't, so the response
time depends on a "random" timeout in the OS kernel and the size of your ARP
cache.)

And hobbitping is even more aggressive with respect to running pings
in parallel than fping.
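
To make the effect of parallel pinging concrete, here is a toy Python model. It is not a description of hobbitping's or fping's actual internals. It assumes a single process that sends all echo requests back to back, then reads the replies one at a time and timestamps each reply only when it gets around to reading it. The function name and the per-packet costs are made-up illustration values.

```python
def simulate_parallel_ping(n_hosts, true_rtt_ms, send_cost_ms, recv_cost_ms):
    """Return the round-trip time (ms) the pinger would record per host.

    Toy model: requests are sent sequentially, each reply arrives
    true_rtt_ms after its request, and replies are read one at a time
    once all requests have been sent.
    """
    send_times = [i * send_cost_ms for i in range(n_hosts)]
    arrivals = [t + true_rtt_ms for t in send_times]

    clock = n_hosts * send_cost_ms  # pinger is busy sending until here
    measured = []
    for sent, arrived in zip(send_times, arrivals):
        start = max(clock, arrived)     # cannot read before the reply arrives
        clock = start + recv_cost_ms    # reading the reply takes time too
        measured.append(clock - sent)   # RTT as seen by the pinger
    return measured


# One host on an idle system versus 2000 hosts in one batch,
# using the same (assumed) network round-trip time of 0.13 ms.
single = simulate_parallel_ping(1, 0.13, 0.02, 0.02)
batch = simulate_parallel_ping(2000, 0.13, 0.02, 0.02)
print(f"single host: {single[0]:.2f} ms")
print(f"batch of 2000: min {min(batch):.2f} ms, max {max(batch):.2f} ms")
```

In this model the network is identical in both runs; only the pinger's own queueing changes the recorded times, which is also why a faster server can make the graphs look better without any change to the hosts or the network.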

I had a curious experience the other day. We've been upgrading the hardware
used for network tests from a (very) old Sun box to a somewhat newer and
more powerful Intel system. A couple of weeks later one of our account
managers came by; they were trying to figure out why the ping-times 
they could see in Xymon had suddenly become so much better. They hadn't
changed anything on the hosts or the network - but the Xymon server was
handling the ping tests much faster.

Performance testing is difficult.


Regards,
Henrik