Xymon Mailing List Archive

Network test time-outs

3 messages in this thread

list James B Horwath · Wed, 25 Jan 2006 10:06:40 -0500 ·
I am having an issue with Hobbit network tests.  At times, network or (client) system congestion causes a network test such as SSH to fail.  The service is not down, just slow; I receive a clear almost immediately after I receive the down.  To keep my sanity I need the network time-out raised.  In BB I was able to correct this by setting the variables BBNETTIMER[1-3] to higher values.  I wasn't able to find such a beast in Hobbit.  I opened the file bbtest-net.c and found:

int             timeout=10;                     /* The timeout (seconds) for all TCP-tests */
and
int             extcmdtimeout = 30;

I do not use ICMP as a connectivity test; I use ssh for up/down, which to me is more realistic.  I searched the mail archives and didn't find anything.  Can I raise the time-out values so the test does not fail so quickly?  Or does anybody else have suggestions on how to work around the issue?

Please advise,
Jim


This message, and any attachments to it, may contain information
that is privileged, confidential, and exempt from disclosure under
applicable law.  If the reader of this message is not the intended
recipient, you are notified that any use, dissemination,
distribution, copying, or communication of this message is strictly
prohibited.  If you have received this message in error, please
notify the sender immediately by return e-mail and delete the
message and any attachments.  Thank you.
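
For readers wondering what a per-test TCP time-out like the `timeout` variable quoted above amounts to in practice, here is a minimal sketch in C. It is not taken from bbtest-net.c: the names `tcp_probe` and `probe_result` are hypothetical, and the sketch covers only the connection phase, whereas the real tool may apply its time-out to the whole test. It shows why a slow but healthy sshd can be reported as down: once the wait expires, the probe gives up whether or not the service would have answered a moment later.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch only. */
enum probe_result { PROBE_OK = 0, PROBE_TIMEOUT = 1, PROBE_FAILED = 2 };

/* Hypothetical helper (not from bbtest-net.c): try to open a TCP
 * connection to ip:port and give up after timeout_sec seconds. */
static enum probe_result tcp_probe(const char *ip, int port, int timeout_sec)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return PROBE_FAILED;              /* not a valid IPv4 address */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return PROBE_FAILED;

    /* Non-blocking mode: connect() returns at once and we control the wait. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return PROBE_FAILED;
    }

    enum probe_result result = PROBE_FAILED;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        result = PROBE_OK;                /* connected immediately */
    } else if (errno == EINPROGRESS) {
        /* Wait until the socket becomes writable or the time-out expires. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
        int n = poll(&pfd, 1, timeout_sec * 1000);
        if (n == 0) {
            result = PROBE_TIMEOUT;       /* slow host: reported as a failure */
        } else if (n > 0) {
            /* Writable: check whether the connect succeeded or was refused. */
            int err = 0;
            socklen_t len = sizeof(err);
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 && err == 0)
                result = PROBE_OK;
        }
    }
    close(fd);
    return result;
}
```

With a helper like this, `tcp_probe("192.0.2.10", 22, 10)` (a documentation-only address) would return `PROBE_TIMEOUT` whenever the handshake takes longer than 10 seconds, and raising the last argument to 15 gives a congested host more room before it is flagged.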
list Larry Barber · Wed, 25 Jan 2006 09:18:25 -0600 ·
You can increase the timeout value of the bbtest-net program by adding
--timeout=N to its invocation line in hobbitlaunch.cfg. I've found that just
increasing it to 15 seconds (from the default 10) makes a huge difference in
the number of timeouts I get.

Thanks,
Larry Barber
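
As an illustration of the suggestion above: in a stock Hobbit install, bbtest-net is normally started from the [bbnet] section of hobbitlaunch.cfg, and the time-out is passed as `--timeout=N` on its CMD line. The fragment below is a sketch only. The ENVFILE path and the other options on the CMD line are assumptions based on a typical default install and will likely differ on your system; the only intended change is the added `--timeout=15`.

```text
[bbnet]
	ENVFILE /usr/local/hobbit/server/etc/hobbitserver.cfg
	NEEDS hobbitd
	CMD bbtest-net --report --ping --checkresponse --timeout=15
	LOGFILE $BBSERVERLOGS/bb-network.log
	INTERVAL 5m
```

The new value should take effect the next time the network test task is run; check the bbtest-net man page for your version to confirm the option name and its default.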
list Olivier Beau · Thu, 26 Jan 2006 08:14:57 +0100 ·
Hi Henrik,

Let's say we have a 20-minute network problem between the hobbitagent and the
hobbitserver. This means we're losing ~15 minutes of statuses from the agent
(with a 5-minute poll time). Correct?

Would it be possible to have a small queue (a kind of bbproxy) on the agent that
could hold the statuses for some time? How would the hobbitserver react to
receiving these old events?
I don't know if this is possible, but it seems great to me for doing
maintenance on the hobbitserver. ;)


Olivier