On Thu, August 17, 2023 16:44, Jeremy Laidman wrote:
On Fri, 18 Aug 2023 at 04:27, J.C. Cleaver <user-87556346d4af@xymon.invalid> wrote:
On Mon, August 14, 2023 21:10, Berry van Sleeuwen wrote:
We recently migrated our server to SLES15 SP4 and found that a few
network tools are missing from the base install: arp, netstat, ifconfig
and route are supposed to be replaced by the "ip" and "ss" commands.
While similar, the output of these commands differs from that of the
traditional tools, so I guess that would interfere with the processing
of the command output. For now it's solved by installing
net-tools-deprecated, but this might not be available in future
versions, so we might need support for these commands. I don't know if
that is also the direction for other distributions, but it's at least
the case for Suse and OpenSuse.
Agreed; it's similar on the RHEL side. xymond_client updates that can
interpret the output of ip and ss are probably called for now. While the
deprecated net-tools will stick around for sure for current systems,
it's only a matter of time until they're removed, and in the meantime
it's one less package to have to pull in for compatibility (and to
explain).
This probably goes higher on the list.
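As a rough illustration of why the parsing has to change (a sketch only,
not xymond_client code): the sample text below is hand-written to
resemble "ss -tln" output, and parse_ss_listeners is a hypothetical
helper name. Unlike netstat, the address and port sit in a single
"Local Address:Port" column, so the split has to happen on the last
colon.

```python
# Illustrative only: SAMPLE_SS is hand-written to resemble `ss -tln`
# output, and parse_ss_listeners is a hypothetical helper name.
SAMPLE_SS = """\
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     0.0.0.0:22          0.0.0.0:*
LISTEN  0       4096    127.0.0.1:1984      0.0.0.0:*
LISTEN  0       128     [::]:22             [::]:*
"""


def parse_ss_listeners(text):
    """Return (address, port) string pairs for every LISTEN line."""
    listeners = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a listening socket.
        if len(fields) < 5 or fields[0] != "LISTEN":
            continue
        # Split on the last colon so bracketed IPv6 addresses stay intact.
        address, _, port = fields[3].rpartition(":")
        listeners.append((address, port))
    return listeners


if __name__ == "__main__":
    for address, port in parse_ss_listeners(SAMPLE_SS):
        print(address, port)
```

Real "ss" output can carry extra columns depending on the options used,
so a production parser would need to be more defensive than this.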
Two brief comments about this.
To me, xymond_client always seemed to be a good candidate for being made
a bit more modular. The code is written in a modular way, to make it
fairly easy to add new ways of collecting the same metrics from
different OSes and OS versions (which would presumably make it possible
for even me to add support for "ip" and "ss", albeit with bloated code
and buffer overruns). But I thought it would be neat to be able to plug
in some kind of run-time process to handle new scenarios - be it a
dynamic library, or a shared memory protocol or a worker module launched
by xymond_channel or similar and written in whatever language was
available and familiar to the sysadmin at the time.
This would definitely be a useful feature, with the most performant option
I assume being a dynamic library module framework of some sort for common
cases. I'm afraid designing this type of module system is a bit out of my
bailiwick.
Then again, many (I suspect most) xymon users don't have the intense
performance needs at the 1000s of msgs/s range. I'd prefer not to encourage
*adding* forking on a per message basis, so a socket method would be on
the table.
Then again, xymond_channel basically *is* a socket and I suspect the
impact of multiple 'client' channel listeners again is only going to
become apparent at massive scale as well. This was the impetus for the
--meta(ex)filter patches (https://sourceforge.net/p/xymon/code/7868/), as
well as --multilocal (https://sourceforge.net/p/xymon/code/7811/) and a
lot of other tuning (https://sourceforge.net/p/xymon/code/7813/). We had a
*lot* of non-*nix "client" data coming through and we wanted to make sure
we were processing as little unnecessary data as possible so we could
quickly move onto the next one.
One can also fork off to (the rather unfortunately named) "pee" utility
from moreutils
(https://www.putorius.net/linux-pee-command-tee-standard-input-into-pipes.html)
and let multiple pipes read from STDIN, as long as they all can perform
efficiently themselves; ultimately this is what we did. We combined all of
our "extra" *nix tests (such as interpreting the output of /proc/mounts
looking for disks that had flipped into ro-mode and generating a status
msg for it) into a single large perl script that simply ran off the same
linux client channel listener that xymond_client did.
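To give a flavour of the /proc/mounts check (our real script was perl;
this is a minimal Python sketch, not that code): the sample text, the
DISK_FSTYPES list, the "romount" test name and both function names are
assumptions made up for illustration. It only builds the status text; it
does not send anything.

```python
# Illustrative only: sample data, filesystem list, test name and
# function names are all assumptions, not taken from the real script.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /data xfs ro,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /run/credentials tmpfs ro,nosuid 0 0
"""

# Only disk-backed filesystems are interesting here (assumed list).
DISK_FSTYPES = {"ext3", "ext4", "xfs", "btrfs"}


def readonly_mounts(text, fstypes=DISK_FSTYPES):
    """Return mount points of disk filesystems mounted read-only."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, mountpoint, fstype, options = fields[:4]
        # Options are comma-separated; look for an exact "ro" entry.
        if fstype in fstypes and "ro" in options.split(","):
            hits.append(mountpoint)
    return hits


def status_text(hostname, hits):
    """Build a Xymon-style status message body for the (assumed) test."""
    color = "red" if hits else "green"
    detail = "\n".join(hits) if hits else "No read-only disk mounts"
    return f"status {hostname}.romount {color} read-only mounts\n{detail}"


if __name__ == "__main__":
    print(status_text("myhost", readonly_mounts(SAMPLE_MOUNTS)))
```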
Given that most users won't have the kind of performance scale
considerations we did, I wonder if what's needed more here isn't just
*standardization* of add-ons (as mentioned elsewhere in the thread).
Example: A drop-in (.d) location that provides a tasks.cfg snippet
specifying what messages it cares about (or enough info for a
plugin-generator to craft one) and a channel listener script directory
with an executable that only gets the filtered messages it cares about,
and is responsible for reinjecting status messages at its desire.
There would also need to be a facility for easily adding graph
definitions, etc.
Perhaps what would be most helpful would simply be packagization templates
that provide the file drops in the necessary locations directly, and/or a
"plugins" directory that is scanned for relevant snippets a level down
when found (e.g., "plugins/foobar/tasks.d/* ; plugins/foobar/graphs.d/*")
to allow these to be distributed as simple tarballs.
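A minimal sketch of the "scan a level down" idea, assuming the
hypothetical layout from the example above (plugins/<name>/tasks.d/*).
Nothing like this exists in Xymon today as far as this thread goes;
collect_snippets and demo are made-up names, and the demo builds a
throwaway directory tree just to show the lookup.

```python
import tempfile
from pathlib import Path


def collect_snippets(plugins_dir, kind):
    """Return sorted snippet files found in <plugins_dir>/*/<kind>/."""
    root = Path(plugins_dir)
    return sorted(p for p in root.glob(f"*/{kind}/*") if p.is_file())


def demo():
    """Build a temporary plugins tree and list its tasks.d snippets."""
    with tempfile.TemporaryDirectory() as tmp:
        plugins = Path(tmp) / "plugins"
        # Hypothetical plugin names and file names, for illustration only.
        for rel in (
            "foobar/tasks.d/foobar.cfg",
            "foobar/graphs.d/foobar.cfg",
            "other/tasks.d/other.cfg",
        ):
            path = plugins / rel
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text("# placeholder snippet\n")
        found = collect_snippets(plugins, "tasks.d")
        return [p.relative_to(plugins).as_posix() for p in found]


if __name__ == "__main__":
    for name in demo():
        print(name)
```

The same call with "graphs.d" would pick up the graph definitions, which
is what would let a plugin ship as a single tarball.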
The emergence of "sar" as a universally available system metric
reporting tool seems to solve this problem in a different way. If
xymond_client had a "sar" module, it could pretty much support any
popular modern OS apart from Windows (so Linux, Solaris, MacOS/*BSD,
HPUX, even IRIX). While sar provides only a subset of the info obtained
from the client script (so it couldn't replace ss/netstat or
ip/ifconfig) it would reduce the overhead of having to support a range
of different tools for plenty of metrics. It would probably standardise
the output of metrics so that the parsing code in xymond_client can be
much simpler, easier to write and maintain, and less likely to have
bugs. Generally speaking, parsing is something that can be difficult to
do safely; programs that parse files and data streams are notoriously
common targets for hackers.
I had to check this as well, as I was sure some [sar] reading had been
added in, but 'twas not the case.
One experiment I had was running rotating 30s for 5m sadc collectors in a
similar manner to how vmstat is executed in the client. This is at the
very least helpful for hostdata client snapshots, but needed a processor
on the server side, like you say, to make better use of it.
Any further work on Xymon needs to ensure that it works with up-to-date
OSes, and anything that can be done to make that easier is likely to help
the cause.
Agreed 100%.
Touching on the collector/processor distinction though: Xymon has had the
client/local facility for quite a while now, dating back to 4.3.7
(https://sourceforge.net/p/xymon/code/6800/) but I'm not sure how
well-known it is.
In the Terabithia packages (and 4.4:
https://sourceforge.net/p/xymon/code/7755/) there's a parallel "/sections"
directory that works the same way but doesn't prepend "local:" to the
section name (intended for site packages vs custom per-box scripting).
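For anyone writing a server-side processor for these, here is a rough
sketch of splitting a client message into its "[name]" sections, so that
"local:" sections can be told apart from plain ones. The sample message
and section names (romount, sitecheck) are invented for illustration,
and split_sections is a hypothetical helper, not a Xymon API.

```python
# Illustrative only: the message content and the section names
# "romount" and "sitecheck" are invented for this example.
SAMPLE_MESSAGE = """\
[uptime]
 16:44:01 up 12 days,  3:02,  1 user
[local:romount]
/data
[sitecheck]
ok
"""


def split_sections(message):
    """Map each section name to its body text, in order of appearance."""
    sections = {}
    current = None
    for line in message.splitlines():
        # A section header is a line of the form "[name]".
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body) for name, body in sections.items()}


if __name__ == "__main__":
    for name in split_sections(SAMPLE_MESSAGE):
        origin = "client/local script" if name.startswith("local:") else "plain section"
        print(f"{name}: {origin}")
```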
-jc