Xymon Mailing List Archive

suppress log contents from 'msgs' column ?

9 messages in this thread

list John Thurston · Wed, 13 May 2015 08:43:14 -0800 ·
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts running the xymon solaris client) contains lines from my logs even when they are reporting 'green'. I don't really want the contents of /var/adm/messages leaked to every viewer of my Xymon web interface.

I now see that this behavior is also on my 4.3.17 server. I've just never noticed it because:
A) I never visit any 'green' bubbles
B) I only have three hosts running the xymon client

When the test goes 'red', I can see value in displaying the offending lines from the log, but I can't see any value in continually leaking arbitrary lines of my logs to arbitrary users. Is there an easy way to suppress these log lines from the green status messages?

Argh. I've only now noticed that the 'cpu' column is also spilling all of my running processes. I thought I had taken care of that problem when I suppressed the 'procs' column. Does anyone actually derive value from having this information continually published on the Xymon web interface for all the world to see?
-- 
    Do things because you should, not just because you can.

John Thurston    XXX-XXX-XXXX
user-ce4d79d99bab@xymon.invalid
Enterprise Technology Services
Department of Administration
State of Alaska
list Mark Felder · Wed, 13 May 2015 11:48:38 -0500 ·
quoted from John Thurston

On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts running the xymon solaris client) contains lines from my logs even when they are reporting 'green'. I don't really want the contents of /var/adm/messages leaked to every viewer of my Xymon web interface.

I now see that this behavior is also on my 4.3.17 server. I've just never noticed it because:
A) I never visit any 'green' bubbles
B) I only have three hosts running the xymon client

When the test goes 'red', I can see value in displaying the offending lines from the log, but I can't see any value in continually leaking arbitrary lines of my logs to arbitrary users. Is there an easy way to suppress these log lines from the green status messages?

Argh. I've only now noticed that the 'cpu' column is also spilling all of my running processes. I thought I had taken care of that problem when I suppressed the 'procs' column. Does anyone actually derive value from having this information continually published on the Xymon web interface for all the world to see?
Is your Xymon server open to the public internet? :-)

cpu: I think it's useful to capture those moments in time especially
when you want to see what changed between green and red.

logs: I agree; it's a waste to capture/display the contents if you're
not matching anything
list John Thurston · Wed, 13 May 2015 09:07:45 -0800 ·
quoted from Mark Felder
On 5/13/2015 8:48 AM, Mark Felder wrote:
On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts
running the xymon solaris client) contains lines from my logs even when
they are reporting 'green'. I don't really want the contents of
/var/adm/messages leaked to every viewer of my Xymon web interface.
- snip -
Is your Xymon server open to the public internet? :-)
No, but my xymon server is available to anyone on my network and handles messages from clients in several departments. I don't see any reason or value in publishing my /var/adm/messages (or spilling my process list) for everyone to read.
quoted from Mark Felder
cpu: I think it's useful to capture those moments in time especially
when you want to see what changed between green and red.
Where is the value in continually spilling the process list when it is green? When there is an alarm, I can begin to see the value. But not on a day-to-day, always-green, basis.

The client normally only reports on a 5-minute interval. When it reports 'red' on 'cpu', is there really any value in knowing what it was idling on five minutes earlier? At best I'm going to care what is sucking the resources right now. In most cases, though, I don't even care about that. In the event of a 'red' notice, I go directly to the host and use my system tools to determine what is wrong.
quoted from Mark Felder
logs: I agree; it's a waste to capture/display the contents if you're
not matching anything
Are you aware of any way to suppress this display? I'd like to stop it.

I can switch to a different client, or suppress more columns, but I'd like to find a more elegant way to reach the goal.
list Japheth Cleaver · Wed, 13 May 2015 10:20:04 -0700 ·
quoted from John Thurston

On Wed, May 13, 2015 10:07 am, John Thurston wrote:
On 5/13/2015 8:48 AM, Mark Felder wrote:
On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts
running the xymon solaris client) contains lines from my logs even when
they are reporting 'green'. I don't really want the contents of
/var/adm/messages leaked to every viewer of my Xymon web interface.
- snip -
Is your Xymon server open to the public internet? :-)
No, but my xymon server is available to anyone on my network and handles
messages from clients in several departments. I don't see any reason or
value in publishing my /var/adm/messages (or spilling my process list)
for everyone to read.
cpu: I think it's useful to capture those moments in time especially
when you want to see what changed between green and red.
Where is the value in continually spilling the process list when it is
green? When there is an alarm, I can begin to see the value. But not on
a day-to-day, always-green, basis.

The client normally only reports on a 5-minute interval. When it reports
'red' on 'cpu', is there really any value in knowing what it was idling
on five minutes earlier? At best I'm going to care what is sucking the
resources right now. In most cases, though, I don't even care about
that. In the event of a 'red' notice, I go directly to the host and use
my system tools to determine what is wrong.
logs: I agree; it's a waste to capture/display the contents if you're
not matching anything
Are you aware of any way to suppress this display? I'd like to stop it.

I can switch to a different client, or suppress more columns, but I'd
like to find a more elegant way to reach the goal.

I think it really comes down to your audience. xymon/hobbit/Big Brother
were all designed around making life easier for sysadmins, so it tends to
default showing more technical data and placing it within easy reach.


For precedent, we do have the following options to xymond_client currently:


--no-ps-listing
  Normally the "procs" status message includes the full
  process-listing received from the client. If you prefer to just have
  the monitored processes shown, this option will turn off the full
  ps-listing.

--no-port-listing
  Normally the "ports" status message includes the full netstat-listing
  received from the client. If you prefer to just have the monitored
  ports shown, this option will turn off the full netstat-listing.


I could see one of two paths here:

a) adding individual flags for each of the remaining test results
xymond_client generates, controlling the raw data on the status page, or

b) a single master "raw data on status" flag used for all at once

They're not mutually exclusive, of course (most restrictive flag would
win), but I'm not sure on the various use cases out there.


One thing to keep in mind is that if you're doing server-side
configuration with dynamic status message (I'd have to check on the static
HTML output, since I haven't used it in quite a while), anyone who can
read the status message will also be able to read the Client Data link,
which contains the raw data anyway. So if this is to hide data, I feel
like it really only makes sense when clients are in --local evaluation
mode.


OTOH, there are other reasons besides security to not display data in the
status message: it's more legible, and it's less data to re-transmit
internally (like netstat output on huge servers).

And finally, there are middle ground options that could be patched in.
xymongen and svcstatus.cgi still see everything in the message output as a
blob to stick into a <PRE> tag, but there's no reason a small disclosure
triangle widget couldn't be used to hide the things below the '&color'
lines, which serve as de-facto sub-statuses.
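
As a rough illustration of the disclosure-widget idea (not a patch against
xymongen or svcstatus.cgi), the folding could look something like the Python
sketch below. The function name fold_status and the sample text are invented
for this example, and it assumes every line starting with '&' is a sub-status
heading and that HTML <details> elements are an acceptable widget.

```python
import html

def fold_status(body):
    """Wrap the raw lines under each '&color' line in a <details> element.

    Hypothetical sketch: each line starting with '&' is treated as a
    sub-status heading, and the lines after it are hidden until the next
    heading. Lines before the first heading are emitted escaped, unfolded.
    """
    out = []
    open_block = False
    for line in body.splitlines():
        if line.startswith("&"):
            if open_block:
                out.append("</pre></details>")
            summary = html.escape(line[1:])
            out.append("<details><summary>%s</summary><pre>" % summary)
            open_block = True
        else:
            out.append(html.escape(line))
    if open_block:
        out.append("</pre></details>")
    return "\n".join(out)

# Invented sample body with two sub-statuses and their raw log lines.
sample = (
    "&green /var/adm/messages: no matches\n"
    "May 13 08:40:01 host sshd[123]: login <admin>\n"
    "&green /var/log/syslog: no matches\n"
    "May 13 08:41:00 host cron[9]: job ran"
)
folded = fold_status(sample)
print(folded)
```

Note that this only changes what is visible by default; the raw lines are
still in the page source, so it helps legibility rather than security.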


Thoughts?

-jc
list Andy Smith · Wed, 13 May 2015 20:42:57 +0100 ·
quoted from Japheth Cleaver
J.C. Cleaver wrote:
On Wed, May 13, 2015 10:07 am, John Thurston wrote:
On 5/13/2015 8:48 AM, Mark Felder wrote:
On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts
running the xymon solaris client) contains lines from my logs even when
they are reporting 'green'. I don't really want the contents of
/var/adm/messages leaked to every viewer of my Xymon web interface.
- snip -
Is your Xymon server open to the public internet? :-)
No, but my xymon server is available to anyone on my network and handles
messages from clients in several departments. I don't see any reason or
value in publishing my /var/adm/messages (or spilling my process list)
for everyone to read.
cpu: I think it's useful to capture those moments in time especially
when you want to see what changed between green and red.
Where is the value in continually spilling the process list when it is
green? When there is an alarm, I can begin to see the value. But not on
a day-to-day, always-green, basis.

The client normally only reports on a 5-minute interval. When it reports
'red' on 'cpu', is there really any value in knowing what it was idling
on five minutes earlier? At best I'm going to care what is sucking the
resources right now. In most cases, though, I don't even care about
that. In the event of a 'red' notice, I go directly to the host and use
my system tools to determine what is wrong.
logs: I agree; it's a waste to capture/display the contents if you're
not matching anything
Are you aware of any way to suppress this display? I'd like to stop it.

I can switch to a different client, or suppress more columns, but I'd
like to find a more elegant way to reach the goal.

I think it really comes down to your audience. xymon/hobbit/Big Brother
were all designed around making life easier for sysadmins, so it tends to
default showing more technical data and placing it within easy reach.


For precedent, we do have the following options to xymond_client currently:


--no-ps-listing
  Normally the "procs" status message includes the full
  process-listing received from the client. If you prefer to just have
  the monitored processes shown, this option will turn off the full
  ps-listing.

--no-port-listing
  Normally the "ports" status message includes the full netstat-listing
  received from the client. If you prefer to just have the monitored
  ports shown, this option will turn off the full netstat-listing.


I could see one of two paths here:

a) adding individual flags for each of the remaining test results
xymond_client generates, controlling the raw data on the status page, or

b) a single master "raw data on status" flag used for all at once

They're not mutually exclusive, of course (most restrictive flag would
win), but I'm not sure on the various use cases out there.


One thing to keep in mind is that if you're doing server-side
configuration with dynamic status message (I'd have to check on the static
HTML output, since I haven't used it in quite a while), anyone who can
read the status message will also be able to read the Client Data link,
which contains the raw data anyway. So if this is to hide data, I feel
like it really only makes sense when clients are in --local evaluation
mode.


OTOH, there are other reasons besides security to not display data in the
status message: it's more legible, and it's less data to re-transmit
internally (like netstat output on huge servers).

And finally, there are middle ground options that could be patched in.
xymongen and svcstatus.cgi still see everything in the message output as a
blob to stick into a <PRE> tag, but there's no reason a small disclosure
triangle widget couldn't be used to hide the things below the '&color'
lines, which serve as de-facto sub-statuses.


Thoughts?

-jc
We find the listings of both procs and ports incredibly useful for
post-incident forensics; we would not want to be without them.  What
about per-host flags similar to HIDEHTTP, which performs a similar
function for sensitive web page checks?
-- 
Andy
list John Thurston · Wed, 13 May 2015 12:32:04 -0800 ·
quoted from Japheth Cleaver
On 5/13/2015 9:20 AM, J.C. Cleaver wrote:

On Wed, May 13, 2015 10:07 am, John Thurston wrote:
On 5/13/2015 8:48 AM, Mark Felder wrote:
On Wed, May 13, 2015, at 11:43, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts
running the xymon solaris client) contains lines from my logs even when
they are reporting 'green'. I don't really want the contents of
/var/adm/messages leaked to every viewer of my Xymon web interface.
- snip -
Is your Xymon server open to the public internet? :-)
No, but my xymon server is available to anyone on my network and handles
messages from clients in several departments. I don't see any reason or
value in publishing my /var/adm/messages (or spilling my process list)
for everyone to read.
. . .
I think it really comes down to your audience. xymon/hobbit/Big Brother
were all designed around making life easier for sysadmins, so it tends to
default showing more technical data and placing it within easy reach.
To me, xymon/hobbit/BB are alerting tools. Their purpose is to tell me "A threshold you defined has been exceeded. You'd better go figure out if there is a problem brewing!" When Xymon has done this, its job is done. I don't expect it to do much more.

It's silly (to me, anyway) to think I can predict all the information I will need to diagnose or correct future host problems and pre-populate Xymon with that information. To know what information I might need, I'd need to know what problem I am going to have. If I know what problem I'm going to have, I should take preemptive steps to avoid having the problem.
quoted from Japheth Cleaver

For precedent, we do have the following options to xymond_client currently:
--no-ps-listing
...
--no-port-listing
...

Thank you for pointing these out. This is just the sort of thing I was hoping to find. It helps clean up some of the leaked information. It still leaves me 'cpu' and 'msgs', but this is a start.
quoted from Japheth Cleaver
I could see one of two paths here:

a) adding individual flags for each of the remaining test results
xymond_client generates, controlling the raw data on the status page, or

b) a single master "raw data on status" flag used for all at once
(a) would mean any new client capabilities would need to be tightly coupled to the server so they weren't left out. It is flexible, but it could be a hassle.

(b) would probably meet my needs nicely. I just want the status. It's nice to have an indication of what breached the threshold, but tailing /var/adm/messages into the HTML page for every green update seems pointless.  Maybe "IncludeRawData=color[,color]" so it could be included with red and yellow but not with green.
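
To make the "IncludeRawData=color[,color]" idea concrete, here is a minimal
Python sketch of the behavior I have in mind. It is not Xymon code: the
function name trim_status, the sample messages, and the simplified message
layout (overall color first, '&color' lines for each check result, everything
else raw data) are all assumptions for illustration.

```python
def trim_status(message, include_raw_for=("red", "yellow")):
    """Drop raw data from a status message unless its color is listed.

    Hypothetical sketch of the proposed IncludeRawData=color[,color] idea.
    Assumes a simplified layout: the first line starts with the overall
    color, '&color ...' lines carry per-check results, and every other
    line is raw client data.
    """
    lines = message.splitlines()
    first = lines[0].split() if lines else []
    color = first[0] if first else ""
    if color in include_raw_for:
        return message  # alarm state: keep the offending lines
    # Otherwise keep only the summary line and the '&color' result lines.
    kept = lines[:1] + [ln for ln in lines[1:] if ln.startswith("&")]
    return "\n".join(kept)

# Invented sample messages.
green_msg = (
    "green System logs OK\n"
    "&green No matching entries in /var/adm/messages\n"
    "May 13 08:40:01 host sshd[123]: Accepted publickey for admin\n"
)
red_msg = (
    "red System logs have critical entries\n"
    "&red Critical entries in /var/adm/messages\n"
    "May 13 08:41:07 host kernel: disk failure on c0t0d0\n"
)
trimmed_green = trim_status(green_msg)
trimmed_red = trim_status(red_msg)
print(trimmed_green)
```

With the default setting, the green message loses its log line while the red
message passes through unchanged.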

See below for an idea for (c).
quoted from Japheth Cleaver
One thing to keep in mind is that if you're doing server-side
configuration with dynamic status message . . . .
Only three of my clients are in "central" mode, and those are only in that mode because they are running on my Xymon servers and come out of the build-process configured that way. All of my other clients are running older BB or BBPe clients. Which means in my case, I should probably pursue option:

(c) I suppress the columns I don't want on the three clients that display them. I suspect that my desire to suppress this information is a fringe-case and not worth the effort to incorporate.
list Shawn Heisey · Wed, 13 May 2015 16:17:52 -0600 ·
quoted from John Thurston
On 5/13/2015 10:43 AM, John Thurston wrote:
In testing 4.3.20, I've noticed that the 'msgs' column (for hosts
running the xymon solaris client) contains lines from my logs even when
they are reporting 'green'. I don't really want the contents of
/var/adm/messages leaked to every viewer of my Xymon web interface.
In client-local.cfg, where your central mode clients get the config that
tells them what should be monitored and sent, any lines in your logs
that match an "ignore" line will NOT be sent from the client to the
server.  Just define log filters that filter out anything unimportant or
sensitive:

Here's one stanza in my client-local.cfg file, monitoring a gluster
host.  I ignore all lines at INFO (the " I " filter) and certain
error/warning lines that I do not want to even be sent from the client
to xymon:

[slc01nas2.REDACTED.com]
log:/var/log/glusterfs/etc-glusterfs-glusterd.vol.log:10240
ignore ( I )
log:/var/log/glusterfs/glustershd.log:10240
ignore ( I )
log:/var/log/glusterfs/nfs.log:10240
ignore xlator\/protocol\/client.so\(client_inodelk\+0x96\) \[0x7ffd96db3426\]\)\)\) 0-: Assertion failed: 0
ignore <gfid:00000000-0000-0000-0000-000000000000> failed .Invalid argument
ignore ( I )
log:/var/log/messages:10240

Here's the corresponding entry in analysis.cfg that actually produces
alarms:

HOST=%slc01(dfs|nas)
        LOG %gluster "%( E )" COLOR=red
        PROC glusterd 1 red
        PROC glusterfs 1 red
        MEMACT  99 100
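
If it helps to picture what the "ignore" lines do, here is a small Python
sketch of the idea. It is only an illustration, not the Xymon client's code:
the helper name filter_log_lines and the sample log lines are made up, and
Python's re module stands in for whatever regex matching the client does.

```python
import re

def filter_log_lines(lines, ignore_patterns):
    """Return the lines that match none of the ignore patterns.

    Hypothetical helper: mimics the idea of 'ignore' in client-local.cfg,
    where matching lines are dropped before anything is sent to the server.
    """
    compiled = [re.compile(p) for p in ignore_patterns]
    return [ln for ln in lines if not any(rx.search(ln) for rx in compiled)]

# Invented sample lines in a gluster-like format (" I " = info, " E " = error).
sample = [
    "[2015-05-13 16:00:01] I [glusterd.c:100] 0-mgmt: heartbeat ok",
    "[2015-05-13 16:00:02] E [socket.c:200] 0-mgmt: connection refused",
    "[2015-05-13 16:00:03] I [glusterd.c:101] 0-mgmt: volume started",
]

kept = filter_log_lines(sample, [r"( I )"])
for line in kept:
    print(line)
```

Only the " E " line survives, so that is the only line the server (and
anyone viewing the status page) would ever see.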

Thanks,
Shawn
list Jeremy Laidman · Thu, 14 May 2015 12:37:35 +1000 ·
quoted from John Thurston
On 14 May 2015 at 06:32, John Thurston <user-ce4d79d99bab@xymon.invalid> wrote:
To me, xymon/hobbit/BB are alerting tools. Their purpose is to tell me "A
threshold you defined has been exceeded. You'd better go figure out if
there is a problem brewing!" When Xymon has done this, its job is done. I
don't expect it to do much more.
Personally, I find Xymon is for much more than alerting.  It's also critical for
forensics.  When a fault has been detected, the graphs and snapshot reports
are extremely valuable for working out what historical factors may be
relevant to a fault.

Two ways I use Xymon for forensics:

1) If an event has a history, there might be a pattern that can enlighten
the cause (eg disk space problems at the start of every month) or a
coincident event (eg packet loss concurrent with a spike in disk I/O).

2) If a threshold measure has a short-term spike or a long-term slow
increase, then identifying when the metric started its incline can help pin
down the change or event that caused it.

"Go fix it" helps with the immediate problem and its purpose is tactical,
for the short term.  But looking to the past can help prevent recurrence in
the future.

In the specific case of a CPU load fault, it can be valuable to know what
processes are new - in other words, what wasn't running 5 minutes before
the event, that was running after the event.  In some cases a new process
lifetime can be gleaned from the STIME column in the output of "ps -ef".
In other cases, it might be a process that is run from cron or inetd, or in
a while loop, and doesn't have a very long lifetime.  Or there might be a
situation where you have a clean-up process that has crashed, and you might
want to know what was running that is no longer. In reality, these are
somewhat contrived scenarios, and I have no concrete examples to prove that
it can happen.  But in your own words, it's "silly to think [we] can
predict all the information [we'll] need", and so in my opinion (and
experience) the more, the better.
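
As a sketch of the before/after comparison described above, the two process
listings can be diffed on their command column. This is illustrative Python
only: the function name, the sample listings, and the assumed 8-column
"ps -ef" layout (with a single-token STIME such as "May12") are mine, and
real output varies by platform.

```python
def commands(ps_output):
    """Extract the command column from 'ps -ef'-style text.

    Assumes the classic 8-column layout (UID PID PPID C STIME TTY TIME CMD)
    with one header line; lines that do not fit are skipped.
    """
    cmds = set()
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 7)
        if len(fields) == 8:
            cmds.add(fields[7])
    return cmds

# Invented snapshots taken five minutes apart.
before = """\
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 May12 ? 00:00:03 /sbin/init
root 412 1 0 May12 ? 00:00:00 /usr/sbin/cron
root 733 1 0 May12 ? 00:00:10 /usr/local/bin/cleanup --daemon
"""
after = """\
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 May12 ? 00:00:03 /sbin/init
root 412 1 0 May12 ? 00:00:00 /usr/sbin/cron
root 9001 412 97 08:40 ? 00:04:55 /usr/local/bin/report --full
"""

started = commands(after) - commands(before)
stopped = commands(before) - commands(after)
print("started:", sorted(started))
print("stopped:", sorted(stopped))
```

Comparing on the command string alone ignores PIDs, so a process that was
restarted between snapshots would not show up in either set.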

If security is the problem, then secure the data.  Suppressing the data is
only one way to secure the data, and doing so can have down-sides.

In my deployment, I limit unauthenticated access to Apache, so those who
don't need to see my log files and process listings, don't get to see them,
but those who might benefit, can see them.

Cheers
Jeremy
list Japheth Cleaver · Thu, 14 May 2015 12:47:28 -0700 ·
quoted from John Thurston
On Wed, May 13, 2015 1:32 pm, John Thurston wrote:
On 5/13/2015 9:20 AM, J.C. Cleaver wrote:
I could see one of two paths here:

a) adding individual flags for each of the remaining test results
xymond_client generates, controlling the raw data on the status page, or

b) a single master "raw data on status" flag used for all at once
(a) would mean any new client capabilities would need to be tightly
coupled to the server so they weren't left out. It is flexible, but it
could be a hassle.

(b) would probably meet my needs nicely. I just want the status. It's
nice to have an indication of what breached the threshold, but tailing
/var/adm/messages into the HTML page for every green update seems
pointless.  Maybe "IncludeRawData=color[,color]" so it could be included
with red and yellow but not with green.

See below for an idea for (c).
I actually probably should have clarified these as xymond_client command
line options rather than flags per se. Although it does bring up an issue
in that the command line is hard-coded for local-configuration mode users.
The only way to modify that is to edit the xymonclient.sh script.

Although an environment flag (cf. (c)) could be useful, a
"LOCALCLIENTOPTS=" variable in clientlaunch.cfg would allow arbitrary
options to be passed to xymond_client in these cases.
quoted from John Thurston

One thing to keep in mind is that if you're doing server-side
configuration with dynamic status message . . . .
Only three of my clients are in "central" mode, and those are only in
that mode because they are running on my Xymon servers and come out of
the build-process configured that way. All of my other clients are
running older BB or BBPe clients. Which means in my case, I should
probably pursue option:

This is correct. If the BB clients are creating status messages rather
than transmitting the raw client messages, then you'd need to
edit/configure things there. It's morally equivalent to Xymon's local
mode.
quoted from John Thurston

(c) I suppress the columns I don't want on the three clients that
display them. I suspect that my desire to suppress this information is a
fringe-case and not worth the effort to incorporate.

On Wed, May 13, 2015 12:42 pm, Andy Smith wrote:
What about
per-host flags similar to HIDEHTTP which performs a similar function for
  sensitive web page checks?
--

I think there's a use case for both types of control here. A set of
xymond_client --options (+a generic option) for not including anything
beyond threshold evaluations in the status messages generated. This helps
the --local config use case as well as provides easy global server
control(*). Secondly, a 'HIDECLIENTDATA' flag in hosts.cfg read in by
xymond_client could give the same effect on a per-host basis for specific
exceptions.


*svcstatus.cgi could be altered to inhibit CLIENTLOG display for a host
with this setting in hosts.cfg too, but given that the data was still
present in the client channel to begin with, that's only a surface level
of security. The "best" solution will be to use --local mode and never
send that raw data to begin with.


I don't think either of these types of options are ready for 4.3.20, but
it's probably something that can be put into the next release pretty
easily.


Regards,

-jc