Mysterious Sawtooth Graphs
list Thorsten Erdmann
Hi,

I use Hobbit to monitor about 700 systems. I get some mysterious looking graphs with the CONN test and also the bbgen test itself. It looks like two overlaid sawtooth curves. Any idea why the graphs look so weird? I cannot believe these are the real response times.

Here are two demo pics:
http://www.trektech.de/test/hobbitgraph_conn.png
http://www.trektech.de/test/hobbitgraph_bbtest.png

BTW: is there a way to speed up the connect test? It needs about 35 sec, which is not critical but not very fast.

Thorsten Erdmann
ITI/EP68
Mercedes Benz Werk Hamburg
Tel.: +XX-XX-XXXX-XXXX
mobil: +XX-XXX-XXXXXXX
Lotus-Fax: +XX-XXX-XXXXXXXXXX
mail: user-9219fb9415b1@xymon.invalid

If you are not the intended addressee, please inform us immediately that you have received this e-mail in error, and delete it. We thank you for your cooperation.
list Stewart L
We get them as well. Not sure why.
--
Stewart
--
An infinite number of mathematicians walk into a bar. The first one orders a
beer. The second orders half a beer. The third, a quarter of a beer. The
bartender says "You're all idiots", and pours two beers.
list Alan Sparks
▸
user-9219fb9415b1@xymon.invalid wrote:
Hi I use Hobbit to monitor about 700 systems. I get some mysterious looking graphs with the CONN test and also the bbgen test itself. It looks like two overlayed sawtooth curves. Any idea why the graphs look so weird? I cannot believe these are the real response times. Here are two demo pics: http://www.trektech.de/test/hobbitgraph_conn.png http://www.trektech.de/test/hobbitgraph_bbtest.png BTW.: is there a way to speed up the connect test. It needs about 35sec which is not critical but not very fast.
I've seen them too. Even on HTTP test graphs. And also not entirely sure why.

As far as the ping bounces around 20-40ms are concerned, this is a problem with hobbitping, and its polling algorithm. If you want the timing correct, I highly suggest you switch to fping. Install fping and change the FPING setting in hobbitserver.cfg:

    FPING="/usr/sbin/fping"

Note: I've used the following on mine with good luck, for several thousand machines:

    FPING="/usr/sbin/fping -i10 -t1500 -r2"

Note you must make sure the xymon or hobbit user has rights to run fping, either by a sudo arrangement or by setting up a setuid capability such as:

    chmod g+x /usr/sbin/fping
    chgrp xymon /usr/sbin/fping
    chmod u+s /usr/sbin/fping

Not sure about your statement regarding connect tests running 35 seconds. You mean the ping or tcp test times listed in the bbtest info page? Or the test setup or DNS resolve times? For 700 hosts, 35 seconds isn't too bad, I've seen 4000 hosts or so run in maybe 110-120 seconds.

-Alan
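To confirm that the chmod/chgrp steps above took effect, the permission bits can be read back. The following is a minimal Python sketch, not part of Hobbit or Xymon; the helper names `describe_mode` and `inspect_binary` are made up for illustration, and the path is the one used in the message. Since a setuid binary runs with its owner's privileges, the setuid bit typically only helps fping if the file is owned by root, so the owner uid is reported as well.

```python
import os
import stat

FPING_PATH = "/usr/sbin/fping"  # path taken from the message above; adjust as needed


def describe_mode(mode: int) -> dict:
    """Report the permission bits that matter for a setuid helper binary."""
    return {
        "setuid": bool(mode & stat.S_ISUID),      # set by: chmod u+s
        "group_exec": bool(mode & stat.S_IXGRP),  # set by: chmod g+x
        "other_exec": bool(mode & stat.S_IXOTH),
    }


def inspect_binary(path: str) -> dict:
    """Stat a file and add its owner and group ids to the permission summary."""
    st = os.stat(path)
    info = describe_mode(st.st_mode)
    info["owner_uid"] = st.st_uid  # 0 means the file is owned by root
    info["group_gid"] = st.st_gid  # compare with the gid of the xymon/hobbit group
    return info


if __name__ == "__main__":
    if os.path.exists(FPING_PATH):
        print(inspect_binary(FPING_PATH))
    else:
        print(f"{FPING_PATH} not found; adjust FPING_PATH")
```

If `setuid` and `group_exec` are both true and the group id matches the monitoring user's group, the setup described above should be in place.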
list Joe Sloan
For what it's worth I've been seeing them too - I thought it was an oddity of our local network. Joe
list Josh Luthman
I see them very often with pings of less than 1 ms.
--
Josh Luthman
"When you have eliminated the impossible, that which remains, however
improbable, must be the truth."
--- Sir Arthur Conan Doyle
list Mv652
Just to add to the other confirmations... We've noticed this behaviour too (also using fping) after moving to 4.3.0.0 beta2 (clean install).

We don't have very many hosts at the moment, but most of our hosts are multi-homed. The monitoring server has a separate network link for each VLAN/network, so no routing is involved for each connection test. We thought the problem might be one or two faulty NICs on the monitoring server, but we get the 'sawtooth' behaviour with some systems but not others within the same VLAN/network.

So far we haven't found a solution though.

Regards,
Mario
list Thorsten Erdmann
▸
I've seen them too. Even on HTTP test graphs. And also not entirely sure why. As far as the ping bounces around 20-40ms are concerned, this is a problem with hobbitping, and its polling algorithm. If you want the timing correct, I highly suggest you switch to fping. Install fping and change the FPING setting in hobbitserver.cfg: FPING="/usr/sbin/fping"
I will try that out. I remember having problems with fping, which is why I switched to hobbitping.
▸
Note you must make sure the xymon or hobbit user has rights to run fping, either by a sudo arrangement or by setting up a setuid capability such as: chmod g+x /usr/sbin/fping chgrp xymon /usr/sbin/fping chmod u+s /usr/sbin/fping
Maybe that was my fping problem. :-)
▸
Not sure about your statement regarding connect tests running 35 seconds. You mean the ping or tcp test times listed in the bbtest info page?
I mean the Ping test times in the bbtest page:

TIME SPENT
Event                                     Starttime           Duration
bbtest-net startup                        1250148155.074249   -
Service definitions loaded                1250148155.075870   0.001621
Tests loaded                              1250148155.103019   0.027149
DNS lookups completed                     1250148155.105950   0.002931
Test engine setup completed               1250148155.109315   0.003365
TCP tests completed                       1250148155.110463   0.001148
PING test completed (651 hosts)           1250148187.146047   32.035584   <--- This one
PING test results sent                    1250148187.283501   0.137454
Test result collection completed          1250148187.283511   0.000010
LDAP test engine setup completed          1250148187.283513   0.000002
LDAP tests executed                       1250148187.283522   0.000009
LDAP tests result collection completed    1250148187.283523   0.000001
NSLOOKUP tests executed                   1250148187.287590   0.004067
Test results transmitted                  1250148187.360348   0.072758
bbtest-net completed                      1250148187.362335   0.001987
TIME TOTAL                                                    32.288086

Bye
Thorsten
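Each Duration in the TIME SPENT listing above is the difference between that row's start time and the previous row's. The following Python sketch recomputes the durations from the quoted start times and picks out the dominant stage. It is only an illustration of the arithmetic: the function name `stage_durations` is made up here, and `Decimal` is used so the microsecond digits are not blurred by floating-point rounding.

```python
from decimal import Decimal

# (event, start time) pairs copied from the bbtest "TIME SPENT" listing above.
STAGES = [
    ("bbtest-net startup", "1250148155.074249"),
    ("Service definitions loaded", "1250148155.075870"),
    ("Tests loaded", "1250148155.103019"),
    ("DNS lookups completed", "1250148155.105950"),
    ("Test engine setup completed", "1250148155.109315"),
    ("TCP tests completed", "1250148155.110463"),
    ("PING test completed (651 hosts)", "1250148187.146047"),
    ("PING test results sent", "1250148187.283501"),
    ("Test result collection completed", "1250148187.283511"),
    ("LDAP test engine setup completed", "1250148187.283513"),
    ("LDAP tests executed", "1250148187.283522"),
    ("LDAP tests result collection completed", "1250148187.283523"),
    ("NSLOOKUP tests executed", "1250148187.287590"),
    ("Test results transmitted", "1250148187.360348"),
    ("bbtest-net completed", "1250148187.362335"),
]


def stage_durations(stages):
    """Return (event, duration) pairs; duration = start time minus previous start time."""
    times = [Decimal(t) for _, t in stages]
    return [
        (stages[i][0], times[i] - times[i - 1])
        for i in range(1, len(stages))
    ]


durations = stage_durations(STAGES)
total = sum(d for _, d in durations)
slowest = max(durations, key=lambda item: item[1])
ping_share = slowest[1] / total

if __name__ == "__main__":
    print("slowest stage:", slowest[0], slowest[1])
    print("total:", total)
    print("share of run spent in slowest stage:", round(ping_share, 4))
```

On these numbers the ping stage accounts for more than 99% of the run, so the ping stage is the only one worth tuning.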
list Buchan Milne
▸
On Wednesday, 12 August 2009 12:21:16 user-9219fb9415b1@xymon.invalid wrote:
Hi I use Hobbit to monitor about 700 systems. I get some mysterious looking graphs with the CONN test and also the bbgen test itself. It looks like two overlayed sawtooth curves. Any idea why the graphs look so weird? I cannot believe these are the real response times.
Are you using hobbitping, or fping? If you are using hobbitping, that would explain your high ping times (20 ms+). You should install fping, and set the FPING variable in hobbitserver.cfg to a name that will find fping (either full path, or just name if it is in the path).
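Whether a given FPING value "will find fping" can be checked outside Hobbit. The following Python sketch is an illustration only: `resolve_ping_command` is a hypothetical helper, not part of Hobbit or Xymon. It assumes the setting is a command name or full path, optionally followed by options as in the `-i10 -t1500 -r2` example earlier in the thread. It looks up the first word the way a shell would, via the standard `shutil.which`.

```python
import shlex
import shutil


def resolve_ping_command(setting: str):
    """Return the executable path the FPING setting resolves to, or None.

    `setting` is the value of FPING, e.g. "fping" or
    "/usr/sbin/fping -i10 -t1500 -r2". Only the first word is looked up;
    any options after it are ignored here.
    """
    parts = shlex.split(setting)
    if not parts:
        return None
    # shutil.which searches PATH for bare names and checks that
    # explicit paths exist and are executable.
    return shutil.which(parts[0])


if __name__ == "__main__":
    for candidate in ("fping", "/usr/sbin/fping -i10 -t1500 -r2"):
        print(candidate, "->", resolve_ping_command(candidate))
```

A result of `None` means the monitoring user would not find the binary with that setting. Run the check as the hobbit/xymon user, since PATH and permissions may differ from your own account.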
▸
Here are two demo pics: http://www.trektech.de/test/hobbitgraph_conn.png http://www.trektech.de/test/hobbitgraph_bbtest.png BTW.: is there a way to speed up the connect test. It needs about 35sec which is not critical but not very fast.
How many devices are you running network tests against, and at what intervals do you want the network tests to be run? I've usually only run network tests at intervals of 1 minute; higher frequency doesn't hold much benefit IMHO. Regards, Buchan
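Buchan's two questions can be answered with a little arithmetic from the figures already posted in the thread: 651 hosts pinged in 32.035584 seconds, and a whole run of 32.288086 seconds. The Python sketch below is illustrative only. The 60-second interval is an assumption taken from the "1 minute" remark above, not Thorsten's actual configuration, and both function names are made up.

```python
def per_host_ms(ping_seconds: float, hosts: int) -> float:
    """Average wall-clock time per pinged host, in milliseconds."""
    return 1000.0 * ping_seconds / hosts


def interval_headroom(total_seconds: float, interval_seconds: float) -> float:
    """Seconds left in the test interval after one full run completes."""
    return interval_seconds - total_seconds


PING_SECONDS = 32.035584   # "PING test completed (651 hosts)" duration from the thread
HOSTS = 651
TOTAL_SECONDS = 32.288086  # "TIME TOTAL" from the thread
INTERVAL_SECONDS = 60.0    # assumption: the 1-minute interval mentioned above

if __name__ == "__main__":
    print(f"per host: {per_host_ms(PING_SECONDS, HOSTS):.1f} ms")
    print(f"headroom: {interval_headroom(TOTAL_SECONDS, INTERVAL_SECONDS):.1f} s")
```

Under that assumption, a run finishes with a bit under half of the interval to spare, which fits the earlier comment that 35 seconds for about 700 hosts is not too bad.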