Xymon Mailing List Archive

DS override - can't get to work

9 messages in this thread

list John Horne · Fri, 17 May 2019 12:34:42 +0000 ·
Hello,

Using the Terabithia RPMs, I've been looking at adding a 'DS' override entry to
our analysis.cfg file.

I have an entry for a web server as:

  DS  http  tcp.http.https:,,weed.plymouth.ac.uk,.rrd:sec  >0.0007 COLOR=yellow

The comparison value (0.0007) was set low just to see that this was working.
However, I can't get it to work. The 'http' column remains green, and the text
shown of the http response (on the web page) indicates that the time is above
the threshold value (e.g. 'Seconds: 0.088597000').

Anyone any ideas about this?


Thanks,

John.

--
John Horne | Senior Operations Analyst | Technology and Information Services
University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK
[http://www.plymouth.ac.uk/images/email_footer.gif]<http://www.plymouth.ac.uk/worldclass>;

This email and any files with it are confidential and intended solely for the use of the recipient to whom it is addressed. If you are not the intended recipient then copying, distribution or other use of the information contained is strictly prohibited and you should not rely on it. If you have received this email in error please let the sender know immediately and delete it from your system(s). Internet emails are not necessarily secure. While we take every care, University of Plymouth accepts no responsibility for viruses and it is your responsibility to scan emails and their attachments. University of Plymouth does not accept responsibility for any changes made after it was sent. Nothing in this email or its attachments constitutes an order for goods or services unless accompanied by an official order form.
list John Horne · Fri, 17 May 2019 13:23:13 +0000 ·
On Fri, 2019-05-17 at 12:34 +0000, John Horne wrote:
Using the Terabithia RPMs, I've been looking at adding a 'DS' override entry
to our analysis.cfg file.

I have an entry for a web server as:

  DS  http  tcp.http.https:,,weed.plymouth.ac.uk,.rrd:sec  >0.0007 COLOR=yellow

I checked the rrd file and it is of type 'GAUGE', and 'sec' is the correct DS
section name.

I tried setting the rule to '>0.0' but it still showed a green status.

I also tried creating a symlink from the rrd file to a simpler name
('jh.rrd -> tcp.http...') - still showed green.

I have also run xymonnet with the '--debug' option, but can see nothing about
the DS entry or checking the rule to determine the overall colour. I can see
the http test colour being determined, but nothing about the DS entry.

I'm wondering if the fact that we are using
'httpstatus;httpsch://weed.plymouth.ac.uk/;"^[23]";"^[^23]"' for the test is
causing a problem.

If I can I'll take a look at the code later on, but to be honest I'm a bit
stumped by this.


John.

list John Horne · Fri, 17 May 2019 14:59:00 +0000 ·
On Fri, 2019-05-17 at 13:23 +0000, John Horne wrote:
I'm wondering if the fact that we are using
'httpstatus;httpsch://weed.plymouth.ac.uk/;"^[23]";"^[^23]"' for the test is
causing a problem.

If I can I'll take a look at the code later on, but to be honest I'm a bit
stumped by this.
I changed the httpstatus check to just 'httpstatus;http://...'. I adjusted the
filename in the analysis.cfg file too. Seems to have made no difference though.

I ran xymond with the '--debug' option, but again could not see anything
relevant about the DS.

Now trying to find out where all this happens in the code... :-(


John.

list Japheth Cleaver · Fri, 17 May 2019 08:26:26 -0700 ·
On 5/17/2019 7:59 AM, John Horne wrote:
On Fri, 2019-05-17 at 13:23 +0000, John Horne wrote:
I'm wondering if the fact that we are using
'httpstatus;httpsch://weed.plymouth.ac.uk/;"^[23]";"^[^23]"' for the test is
causing a problem.

If I can I'll take a look at the code later on, but to be honest I'm a bit
stumped by this.
I changed the httpstatus check to just 'httpstatus;http://...'. I adjusted the
filename in the analysis.cfg file too. Seems to have made no difference though.

I ran xymond with the '--debug' option, but again could not see anything
relevant about the DS.

Now trying to find out where all this happens in the code... :-(
Can you run xymond_rrd with --debug mode? Specifically, the one reading from the "status" channel.

This is responsible for taking the incoming data point and turning it into a "modify" message, so if there's a parsing problem it should show up either at the time of the http message receipt or on initial load as it's importing the rules to begin with.

If a "modify" message is properly being sent out, then it's possible the host+svc combination it's being tagged with is incorrect.


-jc
list John Horne · Fri, 17 May 2019 15:44:41 +0000 ·
On Fri, 2019-05-17 at 14:59 +0000, John Horne wrote:
On Fri, 2019-05-17 at 13:23 +0000, John Horne wrote:
I'm wondering if the fact that we are using
'httpstatus;httpsch://weed.plymouth.ac.uk/;"^[23]";"^[^23]"' for the test
is causing a problem.
Okay, so I tried using DS with the 'conn' test, and that worked fine (the page
went yellow).

Slightly worrying is that the default message which shows the rule in use
is restricted to only 2 decimal places. The code reads the value as a 'double',
but displays it to just 2 places. (So, in my case, a rule of '>0.0007' was
shown on the web page as '>0.00'. A rule of '>0.007' (one fewer 0) was shown
as '>0.01', so it was rounded up.)
A minor point probably, just a little confusing when trying to force a result
using a very small value.
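
The rounding described above is what a fixed two-decimal format produces. A
minimal sketch in Python, assuming the display code does something equivalent
to a printf-style '%.2f' (an assumption; the thread does not quote the actual
format string, and 'shown_threshold' is a made-up name):

```python
def shown_threshold(value: float) -> str:
    # Assumption: the status text formats the limit like printf("%.2f", ...).
    return "%.2f" % value

for rule_value in (0.0007, 0.007):
    print(f"rule >{rule_value} is displayed as >{shown_threshold(rule_value)}")
```

With these two inputs the displayed limits are '>0.00' and '>0.01', which
matches what was seen on the web page.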


John.

list John Horne · Fri, 17 May 2019 16:17:30 +0000 ·
On Fri, 2019-05-17 at 08:26 -0700, Japheth Cleaver wrote:
On 5/17/2019 7:59 AM, John Horne wrote:
On Fri, 2019-05-17 at 13:23 +0000, John Horne wrote:
I'm wondering if the fact that we are using
'httpstatus;httpsch://weed.plymouth.ac.uk/;"^[23]";"^[^23]"' for the test
is causing a problem.
Can you run xymond_rrd with --debug mode? Specifically, the one reading
from the "status" channel.
Okay, done that. All that I can see though are entries such as this:

===========
73838 2019-05-17 16:48:39.732827 xymond_rrd: Got message 744
@@status#744/weed|1558108119.732746|10.120.16.9||weed|http|1558109019|green||green|1558108057|0||0||1558108102|linux||0|

73838 2019-05-17 16:48:39.732830 startpos 275616, fillpos 275616, endpos -1

73838 2019-05-17 16:48:39.732846  - /weed/tcp.http.https:,,weed.plymouth.ac.uk,.rrd: storing 15 bytes into seq 744 (pos: 1/23), at 1558108119: 1558108119:0.08
===========

which looks like it is just updating the RRD file.
No mention of anything being modified (or rather 'modify') at all.
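
For anyone reading such debug lines, the '@@status' header can be split on
'|' to check which host and test a message is tagged with. A small Python
sketch; the field positions used here are guessed from this one line only and
should be checked against the xymond documentation before relying on them:

```python
# One "@@status" header line from the debug output above, with the
# mail-wrapped halves rejoined.
header = ("@@status#744/weed|1558108119.732746|10.120.16.9||weed|http|"
          "1558109019|green||green|1558108057|0||0||1558108102|linux||0|")

fields = header.split("|")

# Assumption: positions 4, 5 and 7 are hostname, test name and colour.
# This is inferred from the single line above, not from the Xymon docs.
hostname, testname, color = fields[4], fields[5], fields[7]
print(hostname, testname, color)
```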

Starting xymon seems to show no problems either. Entries seen such as:

===========
85455 2019-05-17 17:04:33.658972  loadhostnames:checking if this host weed has been defined before...
85455 2019-05-17 17:04:33.658975  loadhostnames:adding host weed as a new item... = 0x55b1f26cd920
85455 2019-05-17 17:04:33.658979  loadhostnames:build_hosttree - status for that add was 0

...

85455 2019-05-17 17:04:33.660025  loadhosts:build_hosttree - walk->clientname for weed is: weed
85455 2019-05-17 17:04:33.660028  loadhosts:build_hosttree - xtreeAdd to rbclients for weed at 0x55b1f26cd920
85455 2019-05-17 17:04:33.660032  loadhosts:build_hosttree - status for that add was 0
===========


John.

list John Horne · Fri, 17 May 2019 16:22:04 +0000 ·
On Fri, 2019-05-17 at 15:44 +0000, John Horne wrote:
On Fri, 2019-05-17 at 14:59 +0000, John Horne wrote:
On Fri, 2019-05-17 at 13:23 +0000, John Horne wrote:
I'm wondering if the fact that we are using
'httpstatus;httpsch://weed.plymouth.ac.uk/;"^[23]";"^[^23]"' for the test
is causing a problem.
Okay, so I tried using DS with the 'conn' test, and that worked fine (the
page went yellow).
I decided to change the HTTP test DS entry to use a regex for the filename, and
that worked!

If I use 'DS  http  %^tcp\.http\.https.*weed.*\.rrd:sec >0.0007 COLOR=yellow'
then that works fine.
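
As a quick sanity check of the regex itself (the part of the rule between '%'
and ':sec'), here is a Python sketch. Xymon does its own regex matching, so
this only assumes that Python's 're' module treats this simple pattern the
same way:

```python
import re

# The filename regex from the DS rule, without the leading '%' and
# the trailing ':sec' dataset name.
pattern = re.compile(r"^tcp\.http\.https.*weed.*\.rrd")
rrd_file = "tcp.http.https:,,weed.plymouth.ac.uk,.rrd"

matched = pattern.search(rrd_file) is not None
print(matched)
```

The '.*' parts simply skip over the ':,,' and ',' characters in the filename,
so the regex never has to spell out a comma.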

I'll see if I can work backwards to find out what in the original filename is
causing the problem.


John.

list John Horne · Fri, 17 May 2019 17:20:02 +0000 ·
On Fri, 2019-05-17 at 16:22 +0000, John Horne wrote:
On Fri, 2019-05-17 at 15:44 +0000, John Horne wrote:
On Fri, 2019-05-17 at 14:59 +0000, John Horne wrote:
On Fri, 2019-05-17 at 13:23 +0000, John Horne wrote:
I'm wondering if the fact that we are using
'httpstatus;httpsch://weed.plymouth.ac.uk/;"^[23]";"^[^23]"' for the
test is causing a problem.
Okay, so I tried using DS with the 'conn' test, and that worked fine (the
page went yellow).
I decided to change the HTTP test DS entry to use a regex for the filename,
and that worked!

If I use 'DS  http  %^tcp\.http\.https.*weed.*\.rrd:sec >0.0007 COLOR=yellow'
then that works fine.

I'll see if I can work backwards to find out what in the original filename is
causing the problem.
The only way I can get this test to work is by prefixing the filename with a
'%' in order to make it a regex.
By trial and error, and not as a regex, I have tried escaping the colon and
comma characters. I have tried including the whole of the filename in single
quotes and double quotes, and then repeated that on literally just the filename
part.
All of these failed.

Oddly I repeated the DS settings on a different client server (same xymon
server), and noticed that the RRD filename was different.
In the hosts.cfg file, if I use 'httpstatus;http://x1...' then the filename
produced is 'tcp.http.x1...'.
But if I use 'httpstatus;https://x2...' then the filename becomes
'tcp.http.https:,,x2...'. The 'https://' part of the URL is now included in
the filename (with the slashes commafied(!)).
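
The two filenames seen above are consistent with a simple mapping from URL to
RRD filename. The following Python sketch is only a reconstruction from those
observations, not the actual Xymon code; 'guess_rrd_name' and the host
'x1.example' are made up for illustration:

```python
def guess_rrd_name(url: str) -> str:
    """Hypothetical reconstruction of the RRD filename for an http test URL."""
    # Assumption: a plain "http://" prefix is dropped, anything else is kept.
    if url.startswith("http://"):
        url = url[len("http://"):]
    # Assumption: "/" cannot be used in a filename, so it becomes ",".
    return "tcp.http." + url.replace("/", ",") + ".rrd"

print(guess_rrd_name("https://weed.plymouth.ac.uk/"))
print(guess_rrd_name("http://x1.example/"))
```

For the 'https://weed.plymouth.ac.uk/' URL this gives the same name as the
RRD file in the debug output, commas included.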

Anyway, it's back to the code I guess.


John.

list John Horne · Tue, 21 May 2019 10:57:49 +0000 ·
On Fri, 2019-05-17 at 12:34 +0000, John Horne wrote:
Hello,

Using the Terabithia RPMs, I've been looking at adding a 'DS' override entry
to our analysis.cfg file.

I have an entry for a web server as:

  DS  http  tcp.http.https:,,weed.plymouth.ac.uk,.rrd:sec  >0.0007 COLOR=yellow

The comparison value (0.0007) was set low just to see that this was working.
However, I can't get it to work. The 'http' column remains green, and the
text shown of the http response (on the web page) indicates that the time is
above the threshold value (e.g. 'Seconds: 0.088597000').
Hello,

Okay, the problem seems to be with the commas in the filename. Within the file
'xymond/client_config.c' exists the function 'check_rrdds_thresholds'. This is
used to check the DS rules, and checks the RRD filename against the one in the
rule. It does this by calling the function 'namematch' (found in
'lib/matching.c').

However, 'namematch' splits up the filename into tokens based on the comma
character. (Not sure why it does this, but I assume the function is used
elsewhere and makes sense in those cases.) As such, the filename never matches
if it contains a comma character (and mine has several of them).
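
To illustrate why the commas matter, here is a deliberately simplified Python
model of the two behaviours. It is not the code from 'lib/matching.c'; the
function names are made up, and the real functions also handle regexes and
other cases:

```python
def comma_list_match(name: str, rule: str) -> bool:
    """Simplified stand-in for 'namematch': the rule is split on commas
    and the name must equal one of the resulting tokens."""
    return any(name == token for token in rule.split(","))

def whole_string_match(name: str, rule: str) -> bool:
    """Simplified stand-in for comparing the two filenames as a whole."""
    return name == rule

rrd_file = "tcp.http.https:,,weed.plymouth.ac.uk,.rrd"

print(rrd_file.split(","))                     # the tokens the rule is cut into
print(comma_list_match(rrd_file, rrd_file))    # False: no token is the full name
print(whole_string_match(rrd_file, rrd_file))  # True
```

In this model a filename without commas matches either way, which would fit
the earlier observation that the DS rule worked for the 'conn' test.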

A short context diff patch is attached, but I'm not sure it's the best.
Basically for the 'http' test don't call 'namematch' but call 'patternmatch'
instead. This is a similar function (also in matching.c), but doesn't split the
filename into tokens. It is usually used for matching substrings, but will work
when comparing two full filenames. It also takes care of regex rules. Using it
locally the patch seems to work fine.

The patch calls 'patternmatch' only if the DS column name is 'http'. As far as
I can tell this is the only case where it is required, so it should have no
effect when DS is used elsewhere. However, others may be using DS with other
column names (I'm thinking 'apache' here) whose filenames also contain commas.
In those cases the bug will reappear unless the column name is also tested for
'apache'.

As always, feel free to modify, or even reject, the patch.


In addition, I did get the limits (&L and &U) reported with the same precision
(number of decimal places) that the user gave in the analysis.cfg file. So, in
my case, '0.0007' gives a precision of 4. The default text output when using DS
then uses this precision for all values. (It also coped with exponent notation,
no leading zeros etc.) This all worked well. However, the reported RRD value
(&V) is likewise restricted to 2 decimal places. I was loath to change this as
it may well break things, and the same 2-digit restriction appears in more than
one place. The output would therefore generally look okay except for the
reported RRD value, so I abandoned that patch.
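
The idea of reusing the precision the user typed can be sketched in Python as
follows. This is not the aborted patch; the function names are hypothetical,
and unlike the patch described above it handles only plain decimal strings,
not exponent notation:

```python
def decimals_in(text: str) -> int:
    """Count the digits after the decimal point in a plain decimal string."""
    _, _, fraction = text.partition(".")
    return len(fraction)

def format_like(text: str, value: float) -> str:
    """Format 'value' with as many decimal places as 'text' was written with."""
    return f"{value:.{decimals_in(text)}f}"

configured = "0.0007"                        # threshold as typed in analysis.cfg
print(decimals_in(configured))
print(format_like(configured, 0.0007))       # the limit
print(format_like(configured, 0.088597))     # a measured value
```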


John.
