Xymon Mailing List Archive

Xymon 4.3.2 http flapping with SSL error

4 messages in this thread

Phil Meech · Thu, 12 May 2011 11:01:14 +0100
Hi All,

I seem to have a bit of a problem with a large number of devices
reporting http flapping.  I am monitoring in hosts.cfg
https://[ip_address]:444.  The SSL column always remains green, but
the http column flaps with the status message:

https://[ip_address]:444/ - SSL error

Seconds:     5.08

Is there any way to increase the time that the http test waits for a
response, or is there any other way to prevent the flapping/red
status?  I have 90 such devices that consume a lot of RAM and the CPU
regularly goes up to 90%+ (I collect SNMP stats via devmon so the
flapping here can be controlled with thresholds).  I presume this is
why the http response can be slow, though it would normally return in
8s.  Hence the question: can the http response timeout in Xymon be
increased?

Cheers,
Phil
Phil Meech · Mon, 23 May 2011 14:00:10 +0100
Hi All,

OK, I realise now that this doesn't have anything to do with the https
test time; I've tried increasing the test time directly via
xymonnet --timeout, and the test still flaps with the same frequency.
The main issue seems to be the rather uninformative "SSL error"
message.  The sslcert check does not go red when the http check fails.
The http check fails at different times; some hosts go red after 5
seconds, others after up to 9 seconds.  I've checked through all the
xymon logs and cannot find anything relating to this error.  Has anyone
got any ideas?
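
For readers unfamiliar with the option mentioned above: xymonnet is
normally launched from tasks.cfg, so the timeout is usually raised by
adding --timeout=N (in seconds) to its CMD line.  The fragment below is
only a sketch.  The 20-second value is arbitrary, and the other options
on the CMD line are assumptions standing in for whatever your existing
[xymonnet] section already uses.

```
# tasks.cfg (sketch; leave ENVFILE, NEEDS, LOGFILE and INTERVAL as they are)
[xymonnet]
	CMD xymonnet --report --ping --checkresponse --timeout=20
```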

Cheers,
Phil

Henrik Størner · Mon, 23 May 2011 16:23:49 +0200
quoted from Phil Meech
> I seem to have a bit of a problem with a large amount of devices
> reporting http flapping.  I am monitoring in hosts.cfg
> https://[ip_address]:444.  The SSL column always remains green, but
> the http column flaps with the status message:
>
> https://[ip_address]:444/ - SSL error
This usually means that connecting to the service failed. There might be some additional information in the xymonnet logfile.

The "sslcert" column (I assume that is the one you mean) only reports the status of the SSL certificate; if it cannot connect to the server then there is no certificate to report on, and hence that status is not updated (and will eventually go purple).
quoted from Phil Meech
> status?  I have 90 such devices that consume a lot of RAM and the CPU
> regularly goes up to 90%+ (I collect SNMP stats via devmon so the
> flapping here can be controlled with thresholds).  I presume this is
> why the http response can be slow, though it would normally return in
> 8s.
If increasing the timeout doesn't help, perhaps you can use something like "badhttp:3:3:5" to delay alerting until the test has failed for more than one poll cycle? This example would cause it to go yellow after 3 failed poll cycles (15 minutes) and red after 5 failed cycles (25 minutes).
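
As an illustration of where the tag goes, a hosts.cfg entry might look like the sketch below. The IP address and hostname are placeholders (not taken from this thread); only the port 444 URL form and the "badhttp:3:3:5" tag come from the messages above.

```
# hosts.cfg (sketch; 192.0.2.10 and device01 are placeholder values)
# badhttp:3:3:5 delays the http column's colour change until the test
# has failed for several consecutive poll cycles.
192.0.2.10   device01   # https://192.0.2.10:444/ badhttp:3:3:5
```

The minutes quoted above assume a 5-minute poll interval, so 3 cycles is 15 minutes and 5 cycles is 25 minutes.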


Regards,
Henrik
Phil Meech · Mon, 23 May 2011 15:57:36 +0100
Hi Henrik,

Thanks for your input; it has solved my issue!

Cheers,
Phil