Xymon Mailing List Archive

strange result in fping by hobbit and fping by yours truly

8 messages in this thread

list Dennis Ortsen · Fri, 2 Nov 2007 13:46:44 +0100 ·
Hi everyone,

I've got a somewhat strange issue with fping in hobbit.

I've got a list of about 122 terminals that are used for a chipcard payment
system. Each one of them has an IP address and connects to a server to
process the payments. Every now and then the terminals lose the connection
(or something similar), and they need to be reset before they can process
payments on the server again. I thought I might use Hobbit to ping all the
terminals to see if they still respond. If they don't reply anymore, they
need a reset.

I only do a conn test on an IP address. I use the full path to fping in
hobbitserver.cfg, without any extra parameters. When Hobbit tests the
connection, only 14 terminals respond to a ping. But when I run fping
myself (as the hobbit user) in a shell against the same set of terminals,
I get a totally different result: 95 responding terminals instead of 14!
It's not just a lucky shot. I can keep trying, and the huge difference
remains between fpinging them myself and Hobbit fpinging them.

I'm running Hobbit 4.2.0. According to bbtest I have 806 hosts that are
pinged, which takes about 26 seconds to complete. Completing all tests
(954) takes about 47 seconds.

Does anyone have a clue why specifically these terminals have such a
difference in hobbit ping and a ping performed by myself?

I can't explain it, and it doesn't seem like a timeout (latency) issue to me
either.

Thanks in advance,

Br.

Dennis
list Greg L Hubbard · Fri, 2 Nov 2007 09:20:58 -0500 ·
DNS? 

list Dennis Ortsen · Fri, 2 Nov 2007 16:28:29 +0100 ·
These terminals are not in DNS. There are also a large number of Cisco
access points that are fpinged the same way. These APs are also not in DNS.
There are even more APs than terminals: 247.
list Hobbit User · Fri, 2 Nov 2007 11:36:56 -0400 (EDT) ·
You're aware, aren't you, that unless you have "testip" keyword on the
bb-hosts line, hobbit DNS resolves the bb-hosts name and uses the bb-hosts
IP only if the name lookup fails?
list Josh Luthman · Fri, 2 Nov 2007 11:49:51 -0400 ·
What was suggested was that you have something like

172.16.1.50 terminal150.notmydomain.com #

Here Hobbit attempts to resolve terminal150.notmydomain.com, fails, and
then pings the IP address. Since you know the IP and it isn't going to
change, try

172.16.1.50 terminal150.notmydomain.com # testip

See if that resolves your issue.  Note you can click on the host on the web
interface and it will say the IP it is pinging.
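
The lookup order described in this thread can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not Hobbit's actual code: `pick_ping_address` and its `resolve` parameter are hypothetical names, and the resolver is passed in so the logic can be tried without real DNS.

```python
import socket


def pick_ping_address(listed_ip, hostname, tags, resolve=socket.gethostbyname):
    """Return the address a ping test would target for one bb-hosts entry.

    Hypothetical helper: mirrors the rule described in this thread,
    not Hobbit's real implementation.
    """
    if "testip" in tags:
        # testip: skip DNS entirely and trust the IP written in bb-hosts.
        return listed_ip
    try:
        # Without testip, the hostname is looked up first ...
        return resolve(hostname)
    except OSError:
        # ... and the bb-hosts IP is used only when the lookup fails
        # (socket.gaierror is a subclass of OSError).
        return listed_ip
```

If a name happens to resolve to some other address, the test would ping that address instead of 172.16.1.50, which may be one way the results end up differing from a manual fping against the IPs.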

-- 

Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX

Those who don't understand UNIX are condemned to reinvent it, poorly.
--- Henry Spencer
list Henrik Størner · Fri, 2 Nov 2007 18:06:32 +0100 ·
What error do the failed ping tests show: a DNS error, or just a ping
failure? If it's DNS errors, use the "testip" tag to avoid doing DNS
lookups. I understand from the other mails in the thread that these
systems are not in the DNS.

I suspect it might be an issue with the number of tests running
simultaneously. ICMP packets have lower priority in most network
equipment, and are therefore the first packets to be discarded when there
is a lot of traffic on the network. Since you're referring to
"terminals", they might not have a lot of memory for network buffers and
might therefore drop packets more often than "real" computers.

There is also a possibility that the number of hosts tested is
overflowing the ARP cache table on the Hobbit server, which can lead to
packets being lost.

I'd suggest starting with some extra options for bbtest-net (in
hobbitlaunch.cfg): add --concurrency=32 to the bbtest-net command, and
change the FPING setting in hobbitserver.cfg to FPING="fping -i150".
This increases the time fping waits between sending packets from 25 ms
to 150 ms, so there is less ICMP traffic on the network.
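
To get a feel for what the longer interval costs, here is a rough back-of-the-envelope sketch in Python. It assumes one packet per host and ignores retries and the time spent waiting for replies, so it is only a lower bound. The function name `min_sweep_seconds` is made up for this example; the host count is the 806 pinged hosts mentioned earlier in the thread.

```python
def min_sweep_seconds(host_count, interval_ms):
    """Lower bound on sweep time: one packet per host, spaced interval_ms apart."""
    return host_count * interval_ms / 1000.0


if __name__ == "__main__":
    hosts = 806  # number of pinged hosts reported earlier in the thread
    for interval_ms in (25, 150):
        seconds = min_sweep_seconds(hosts, interval_ms)
        print(f"-i{interval_ms}: at least {seconds:.1f} s for {hosts} hosts")
```

Under these assumptions the sweep goes from roughly 20 seconds to about two minutes, so the gentler packet rate is paid for with a noticeably longer ping run.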

You can replicate how Hobbit performs the ping test by putting the IPs
of all your hosts into a text file (one IP per line), then running
    fping -Ae </tmp/IPlist.txt
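
To compare such a manual run with what Hobbit reports, the output can be counted with a small script. The sketch below is in Python and assumes fping prints one line per host in the form "<ip> is alive (...)" or "<ip> is unreachable"; check this against your fping version. `summarize_fping` is a hypothetical helper name, and the sample lines are invented, not real measurements.

```python
def summarize_fping(output):
    """Split fping's per-host result lines into alive and unreachable lists."""
    alive, unreachable = [], []
    for line in output.splitlines():
        parts = line.split()
        # Assumed line shapes: "<ip> is alive (<n> ms)" or "<ip> is unreachable".
        if len(parts) >= 3 and parts[1] == "is":
            if parts[2] == "alive":
                alive.append(parts[0])
            elif parts[2] == "unreachable":
                unreachable.append(parts[0])
    return alive, unreachable


if __name__ == "__main__":
    # Invented sample output for illustration only.
    sample = "172.16.1.50 is alive (1.20 ms)\n172.16.1.51 is unreachable\n"
    up, down = summarize_fping(sample)
    print(f"{len(up)} alive, {len(down)} unreachable")
```

Feeding it the saved output of the fping command above gives a count that can be set side by side with the number of green conn tests.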


Regards,
Henrik
list Dennis Ortsen · Mon, 5 Nov 2007 09:14:08 +0100 ·
The testip tag solved my problem.

Strange, however, that with that large number of access points I didn't have
the same issue. Perhaps it's just the additional number of IP addresses that
are not resolvable that made this visible?

Anyway, I now get the correct results, thanks guys.

Br.

Dennis 
list Josh Luthman · Mon, 5 Nov 2007 10:35:18 -0500 ·
You may have had ap1.foo.com and ap2.foobar.com (two different domains, one
of which is not registered).