Xymon Mailing List Archive search

Mysterious Sawtooth Graphs

8 messages in this thread

list Thorsten Erdmann · Wed, 12 Aug 2009 13:21:16 +0200 ·
Hi

I use Hobbit to monitor about 700 systems. I get some mysterious-looking graphs with the CONN test and also the bbgen test itself.
They look like two overlaid sawtooth curves. Any idea why the graphs look so weird? I cannot believe these are the real response times.

Here are two demo pics:

http://www.trektech.de/test/hobbitgraph_conn.png
http://www.trektech.de/test/hobbitgraph_bbtest.png

BTW: is there a way to speed up the connect test? It takes about 35 seconds, which is not critical but not very fast either.

Thorsten Erdmann
ITI/EP68
Mercedes Benz Werk Hamburg
Tel.: +XX-XX-XXXX-XXXX
mobil: +XX-XXX-XXXXXXX
Lotus-Fax:+XX-XXX-XXXXXXXXXX
mail: user-9219fb9415b1@xymon.invalid


If you are not the intended addressee, please inform us immediately that you have received this e-mail in error, and delete it. We thank you for your cooperation.
list Stewart L · Wed, 12 Aug 2009 07:45:35 -0400 ·
We get them as well.  Not sure why.
-- 

Stewart
--
An infinite number of mathematicians walk into a bar. The first one orders a
beer. The second orders half a beer. The third, a quarter of a beer. The
bartender says "You're all idiots", and pours two beers.
list Alan Sparks · Wed, 12 Aug 2009 09:50:14 -0600 ·
I've seen them too.  Even on HTTP test graphs.  And also not entirely
sure why.

As far as the ping bounces around 20-40ms are concerned, this is a
problem with hobbitping, and its polling algorithm.  If you want the
timing correct, I highly suggest you switch to fping.  Install fping and
change the FPING setting in hobbitserver.cfg:
FPING="/usr/sbin/fping"

Note: I've used the following on mine with good luck, for several
thousand machines:
FPING="/usr/sbin/fping -i10 -t1500 -r2"
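As a rough sanity check on those flags, here is a minimal sketch of the time floors they imply. It assumes the usual fping meaning of the options: `-i` is the minimum gap in milliseconds between packets sent to any target, `-t` is the initial per-target timeout in milliseconds, and `-r` is the number of retries. The function names are made up for illustration, and real runs take longer because of round-trip times and the timeout backoff between retries.

```python
def send_floor_seconds(hosts: int, interval_ms: float) -> float:
    """Lower bound on the time needed to send one packet to every host,
    assuming packets are spaced at least interval_ms apart (fping -i)."""
    return hosts * interval_ms / 1000.0


def dead_host_floor_seconds(timeout_ms: float, retries: int) -> float:
    """Lower bound on the wait for a host that never answers, assuming each
    of the 1 + retries attempts waits at least timeout_ms (fping -t, -r)."""
    return (retries + 1) * timeout_ms / 1000.0


if __name__ == "__main__":
    # 700 is the host count from the original post; 10, 1500 and 2 come
    # from the FPING setting quoted above.
    print(f"send floor for 700 hosts: {send_floor_seconds(700, 10):.1f} s")
    print(f"floor per dead host:      {dead_host_floor_seconds(1500, 2):.1f} s")
```

Under these assumptions a single pass over 700 hosts cannot finish in much less than 7 seconds, and every unreachable host adds several seconds of waiting on top of that.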

Note that the xymon or hobbit user must have rights to run fping,
either through a sudo arrangement or by making the binary setuid and
group-executable, for example:
chmod g+x /usr/sbin/fping
chgrp xymon /usr/sbin/fping
chmod u+s /usr/sbin/fping
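To confirm that the permission bits ended up as intended, a small check along these lines may help. It is only a sketch: the path is the one used in the commands above, and the helper names `describe_mode` and `check_fping` are hypothetical, not part of Xymon or fping.

```python
import os
import stat


def describe_mode(mode: int) -> dict:
    """Report the permission bits that matter for a setuid fping binary."""
    return {
        "setuid": bool(mode & stat.S_ISUID),       # set by chmod u+s
        "group_exec": bool(mode & stat.S_IXGRP),   # set by chmod g+x
        "other_exec": bool(mode & stat.S_IXOTH),   # executable by everyone?
    }


def check_fping(path: str = "/usr/sbin/fping") -> dict:
    """Stat the binary and add ownership details to the mode report."""
    st = os.stat(path)
    report = describe_mode(st.st_mode)
    report["owned_by_root"] = st.st_uid == 0  # setuid only helps if root owns it
    report["gid"] = st.st_gid                 # should be the xymon group's gid
    return report


if __name__ == "__main__":
    path = "/usr/sbin/fping"  # assumed location, as in the commands above
    if os.path.exists(path):
        print(check_fping(path))
    else:
        print(f"{path} not found; adjust the path for your system")
```

If `setuid` or `group_exec` comes back false, or the file is not owned by root, the monitoring user will most likely be unable to send ICMP packets through fping.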

Not sure about your statement regarding connect tests running 35
seconds. You mean the ping or tcp test times listed in the bbtest info
page? Or the test setup or DNS resolve times? For 700 hosts, 35 seconds
isn't too bad, I've seen 4000 hosts or so run in maybe 110-120 seconds.
-Alan
list Joe Sloan · Wed, 12 Aug 2009 13:44:44 -0700 ·
For what it's worth I've been seeing them too - I thought it was an oddity of our local network.

Joe
list Josh Luthman · Wed, 12 Aug 2009 16:59:37 -0400 ·
I see them very often with less than 1 ms pings.
-- 

Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX

"When you have eliminated the impossible, that which remains, however
improbable, must be the truth."
--- Sir Arthur Conan Doyle
list Mv652 · Thu, 13 Aug 2009 01:16:11 -0600 ·
Just to add to the other confirmations... 

We've noticed this behaviour too (also using fping) after moving to 4.3.0.0 beta2 (clean install). 
We don't have very many hosts at the moment, but most of our hosts are multi-homed.
The monitoring server has a separate network link for each VLAN/network, so no routing is involved for each connection test. 

We thought the problem might be one or two faulty NICs on the monitoring server, but we get the 'sawtooth' behaviour with some systems and not with others within the same VLAN/network. 

So far we haven't found a solution though. 
Regards,
Mario 
list Thorsten Erdmann · Thu, 13 Aug 2009 09:28:30 +0200 ·
quoted from Alan Sparks
I've seen them too.  Even on HTTP test graphs.  And also not entirely
sure why.

As far as the ping bounces around 20-40ms are concerned, this is a
problem with hobbitping, and its polling algorithm.  If you want the
timing correct, I highly suggest you switch to fping.  Install fping and
change the FPING setting in hobbitserver.cfg:
FPING="/usr/sbin/fping"
I will try that out. I remember having problems with fping, which is why I switched to hobbitping.
quoted from Alan Sparks
Note you must make sure the xymon or hobbit user has rights to run
fping, either by a sudo arrangement or by setting up a setuid capability
such as:
chmod g+x /usr/sbin/fping
chgrp xymon /usr/sbin/fping
chmod u+s /usr/sbin/fping
Maybe that was my fping problem. :-)
quoted from Alan Sparks
Not sure about your statement regarding connect tests running 35
seconds. You mean the ping or tcp test times listed in the bbtest info
page?
I mean the PING test time on the bbtest page:

TIME SPENT
Event                                            Starttime Duration
bbtest-net startup                       1250148155.074249 -
Service definitions loaded               1250148155.075870 0.001621
Tests loaded                             1250148155.103019 0.027149
DNS lookups completed                    1250148155.105950 0.002931
Test engine setup completed              1250148155.109315 0.003365
TCP tests completed                      1250148155.110463 0.001148
PING test completed (651 hosts)          1250148187.146047 32.035584 <--- This one
PING test results sent                   1250148187.283501 0.137454
Test result collection completed         1250148187.283511 0.000010
LDAP test engine setup completed         1250148187.283513 0.000002
LDAP tests executed                      1250148187.283522 0.000009
LDAP tests result collection completed   1250148187.283523 0.000001
NSLOOKUP tests executed                  1250148187.287590 0.004067
Test results transmitted                 1250148187.360348 0.072758
bbtest-net completed                     1250148187.362335 0.001987
TIME TOTAL 32.288086
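To put the timing output in perspective, the durations can be summed and the dominant step picked out. The sketch below simply copies the Duration column from the output above; the variable names are illustrative.

```python
# Durations in seconds, copied from the bbtest-net "TIME SPENT" output.
steps = [
    ("Service definitions loaded", 0.001621),
    ("Tests loaded", 0.027149),
    ("DNS lookups completed", 0.002931),
    ("Test engine setup completed", 0.003365),
    ("TCP tests completed", 0.001148),
    ("PING test completed (651 hosts)", 32.035584),
    ("PING test results sent", 0.137454),
    ("Test result collection completed", 0.000010),
    ("LDAP test engine setup completed", 0.000002),
    ("LDAP tests executed", 0.000009),
    ("LDAP tests result collection completed", 0.000001),
    ("NSLOOKUP tests executed", 0.004067),
    ("Test results transmitted", 0.072758),
    ("bbtest-net completed", 0.001987),
]

total = sum(duration for _, duration in steps)
slowest_name, slowest = max(steps, key=lambda step: step[1])
share = slowest / total                # fraction of the run spent in that step
per_host_ms = slowest / 651 * 1000.0   # 651 hosts, as reported by the PING step

print(f"total run time: {total:.6f} s")
print(f"slowest step:   {slowest_name} ({share:.1%} of the run)")
print(f"average cost:   {per_host_ms:.1f} ms per pinged host")
```

The durations add up to the reported TIME TOTAL, and the PING step accounts for more than 99% of it, at roughly 49 ms per host on average. Everything else in the run is negligible, so the ping tool and its settings are the only place where tuning can pay off.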

Bye
Thorsten
list Buchan Milne · Thu, 13 Aug 2009 16:55:15 +0100 ·
quoted from Thorsten Erdmann
On Wednesday, 12 August 2009 12:21:16 user-9219fb9415b1@xymon.invalid wrote:
Hi

I use Hobbit to monitor about 700 systems. I get some mysterious looking
graphs with the CONN test and also the bbgen test itself.
It looks like two overlayed sawtooth curves. Any idea why the graphs look
so weird? I cannot believe these are the real response times.
Are you using hobbitping or fping? If you are using hobbitping, that would explain your high ping times (20 ms+). You should install fping and set the FPING variable in hobbitserver.cfg to a value that will find fping (either the full path, or just the name if it is in the PATH).
quoted from Thorsten Erdmann
Here are two demo pics:

http://www.trektech.de/test/hobbitgraph_conn.png
http://www.trektech.de/test/hobbitgraph_bbtest.png

BTW.: is there a way to speed up the connect test. It needs about 35sec
which is not critical but not very fast.
How many devices are you running network tests against, and at what interval do you want the network tests to run? I've usually only run network tests at 1-minute intervals; a higher frequency doesn't hold much benefit IMHO.

Regards,
Buchan