Xymon Mailing List Archive

Bug? procs test going red green every hour

8 messages in this thread

list John Horne · Thu, 21 Mar 2024 15:38:45 +0000 ·
Hello,

We are using Xymon 4.3.30 from the Terabithia rpms on Linux servers.
Sorry the message is a bit long; necessary in order to explain what is going
on.

We have recently noticed that the 'procs' test goes red every hour, and then 5
mins later changes to green. This happens just after the hour, and has only
been seen on 4 servers. The procs test indicates a process (either 'clamscan'
or 'freshclam') of the ClamAV package is not present.

Looking into this I can see that the Xymon 'hostdata' output uses the 'ps'
command to get the current processes. The actual 'ps' command is defined in
'xymonclient-linux.sh' as:
ps -Aww f -o pid,ppid,user,start,state,pri,pcpu,time:12,pmem,rsz:10,vsz:10,cmd

However, looking at the actual hostdata file for one of the servers shows:

=====
1197212       1 root       Mar 10 S  19  0.0     00:00:06  0.0       3660       6072 /usr/sbin/crond -n
2822567 1197212 root     15:01:00 S  19  0.0     00:00:00  0.0       5168      12436  \_ /usr/sbin/CROND -n
2822568 2822567 root     15:01:00 S  19  0.0     00:00:00  0.0       3464       7124      \_ /usr/bin/bash /bin/run-parts /etc/cron.hourly
2822580 2822568 root     15:01:00 S  19  0.0     00:00:00  0.0       3388       7124          \_ /bin/bash /etc/cron.hourly/security.cron
2822581 2822568 root     15:01:00 S  19  0.0     00:00:00  0.0       1104       4016          \_ sed 1i\ /etc/cron.hourly/security.cron: 611279       1 clamscan   Mar 14 S  19  0.0     00:03:23  8.6    1389464    1573272 /usr/sbin/clamd -c /etc/clamd.d/scan.conf
=====

('security.cron' is just an in-house script.) As can be seen 'crond/CROND' runs
'run-parts', which in turn runs the 'cron.hourly' jobs (security.cron).
However, that last line is corrupted - it is running a 'sed' command, but has
the following 'ps' output command appended to it (the one with
'/usr/sbin/clamd').
Hence, the 'procs' test sees the 'sed' command (we don't monitor for that), but
misses the '/usr/sbin/clamd' command (which we do monitor) because it is all on
one line. So the 'procs' test goes red. Five mins later, when the various jobs
have ended, the clamd command is again seen within the 'ps' output because
'run-parts' is not running.

So, looking at 'run-parts' (it's a shell script) shows that how it actually
runs the cron jobs is by:

=====
# run executable files
logger -p cron.notice -t "run-parts[$$]" "($1) starting $(basename $i)"
$i 2>&1 | sed '1i\
'"$i"':\
'
logger -p cron.notice -t "run-parts[$$]" "($1) finished $(basename $i)"
=====

This is where the 'sed' command comes from, and as can be seen it outputs newlines
using the escape character.

I have tested by changing the 'sed' command into just one line by replacing the
newlines with "\n". This works fine, and the 'procs' test remains green.
I also noticed that on CentOS 7 servers 'run-parts' uses 'awk' rather than
'sed'. So this is why the problem isn't seen on those servers - it is only seen
on Rocky 8 and 9 Linux servers.
I have also stopped the clamd process in order that some other process follows
the run-parts one. The hostdata file again shows the new process as being
appended to the sed command. If we are monitoring that new process, then again
procs goes red.

I need to do more testing, but am a little lost as to whether the bug (if it
exists) is in the 'ps' output, the way it is recorded in the hostdata file or
in the processing of the 'procs' test.


John.

--
John Horne | Senior Operations Analyst | Technology and Information Services
University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK
[https://www.plymouth.ac.uk/images/email_footer.gif]<http://www.plymouth.ac.uk/worldclass>;

This email and any files with it are confidential and intended solely for the use of the recipient to whom it is addressed. If you are not the intended recipient then copying, distribution or other use of the information contained is strictly prohibited and you should not rely on it. If you have received this email in error please let the sender know immediately and delete it from your system(s). Internet emails are not necessarily secure. While we take every care, University of Plymouth accepts no responsibility for viruses and it is your responsibility to scan emails and their attachments. University of Plymouth does not accept responsibility for any changes made after it was sent. Nothing in this email or its attachments constitutes an order for goods or services unless accompanied by an official order form.
list John Horne · Thu, 21 Mar 2024 16:20:58 +0000 ·
On Thu, 2024-03-21 at 15:38 +0000, John Horne wrote:
> I need to do more testing, but am a little lost as to whether the bug (if it
> exists) is in the 'ps' output, the way it is recorded in the hostdata file or
> in the processing of the 'procs' test.

Running a test and checking the 'ps' output shows that it is okay. Using a
small script ('./aaa') and running the ps command shows:

=====
3023287 3023286 john     15:57:36 S  19  0.4     00:00:04  0.0       5136       8312  \_ -bash
3235462 3023287 john     16:14:08 S  19  0.0     00:00:00  0.0       3768       7256      \_ /bin/bash ./aaa
3235812 3235462 john     16:15:05 S  19  0.0     00:00:00  0.0       1028       3056      |   \_ sleep 3
3235463 3023287 john     16:14:08 S  19  0.0     00:00:00  0.0       1108       4016      \_ sed 1i\ ./aaa:\
3018834 3018821 john     15:38:54 S  19  0.0     00:00:00  0.0       7240      19036 sshd: john@pts/0
=====

As expected the sed command shows the script name followed by a colon and
escape character. There is no corruption of the following line (sshd) being
appended to it.


John.

list John Horne · Thu, 21 Mar 2024 20:11:08 +0000 ·
On Thu, 2024-03-21 at 15:38 +0000, John Horne wrote:
> I need to do more testing, but am a little lost as to whether the bug (if it
> exists) is in the 'ps' output, the way it is recorded in the hostdata file or
> in the processing of the 'procs' test.

Running tcpdump of what is being sent to the main Xymon server shows that the
corrupted line is occurring on the client. So I need to look into the
xymonclient side of things.


John.

list Jeremy Laidman · Fri, 22 Mar 2024 10:10:06 +1100 ·
This seems to be an artefact of the "xymon" command, perhaps sanitising
input. If I run this command:

printf "data `uname -n`.TEST\ntesting\    \ntesting\n" | { telnet 127.1 1984; sleep 1; }

the message gets to xymon as is, with a backslash and some trailing
spaces, as in "testing\<space><space><space><newline>testing<newline>". If
I change the command after the pipe to xymon, like so:

printf "data `uname -n`.TEST\ntesting\    \ntesting\n" | xymon 127.1 @

then the message appears as "testingtesting" with all of the whitespace
stripped out. This happens when one or more spaces or tabs follow a
backslash and are then followed by a newline. (Interestingly, a
carriage return in the whitespace seems to also corrupt the string after
the newline - possibly leading to a buffer overflow in some cases - and
while this is unlikely in the output of "ps", there may be other ways to
abuse xymon with this technique.)

So I think the issue is triggered for you when the ps output has "sed ...
security.cron:<space>".

I suspect if you clean up the output of the "ps" line in
xymonclient-linux.sh to remove trailing whitespace, then it might fix your
problem. Something like this:

ps -Aww f -o pid,ppid,user,start,state,pri,pcpu,time:12,pmem,rsz:10,vsz:10,cmd | sed 's/ *$//'

J
list John Horne · Fri, 22 Mar 2024 13:04:58 +0000 ·
On Fri, 2024-03-22 at 10:10 +1100, Jeremy Laidman wrote:
> This seems to be an artefact of the "xymon" command, perhaps sanitising
> input.

I would agree.

> So I think the issue is triggered for you when the ps output has "sed ...
> security.cron:<space>".
>
> I suspect if you clean up the output of the "ps" line in xymonclient-linux.sh
> to remove trailing whitespace, then it might fix your problem. Something like
> this:
>
> ps -Aww f -o pid,ppid,user,start,state,pri,pcpu,time:12,pmem,rsz:10,vsz:10,cmd | sed 's/ *$//'

Unfortunately that doesn't work.

I am more wondering if xymon is somehow seeing or at least treating the 'sed'
command as if it had an escaped newline at the end (perhaps after stripping
spaces). So it appends the following line to it.

If I modify the current 'sed' command to make it all on one line (no escape
characters), then it all works fine.

Ironically I have found that in our local analysis.cfg file the clamd/freshclam
processes were configured as:

PROC    "%^/usr/sbin/clamd -c" TEXT=clamd
PROC    "%^/usr/bin/freshclam -d" TEXT=freshclam

So the commands must be at the beginning of the command line. Looking at
'hostdata' for other servers I could see that other commands we monitor were
corrupted by the run-parts sed command. However, these did not cause the procs
test to go red. The reason was, using 'dnsmasq' as an example, because they
were defined differently in the analysis.cfg file:

PROC    /usr/sbin/dnsmasq TEXT=dnsmasq

In this case what is looked for ('/usr/sbin/dnsmasq') is still found regardless
of whether it appears at the start of the command-line or not. So reconfiguring
the clamd and freshclam entries allows them to work as well (i.e. the procs test
remains green).
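
To illustrate why the two PROC styles behave differently on a merged line, here is a minimal Python sketch. It assumes that a rule is applied to the command text of each 'ps' line, that a leading "%" marks a regular expression, and that a plain string is matched as a substring. The helper 'proc_matches' is hypothetical and is not Xymon code, and the unanchored '/usr/sbin/clamd' rule is only an assumed example of a reconfigured entry.

```python
import re


def proc_matches(rule: str, command: str) -> bool:
    """Hypothetical stand-in for a PROC rule check (not Xymon's own code)."""
    if rule.startswith("%"):
        # "%" prefix: treat the remainder as a regular expression
        return re.search(rule[1:], command) is not None
    # No prefix: plain substring match anywhere in the command text
    return rule in command


# Command text as it should appear, and as it appears once the clamd
# line has been appended to the run-parts 'sed' line.
normal = "/usr/sbin/clamd -c /etc/clamd.d/scan.conf"
merged = (
    "sed 1i\\ /etc/cron.hourly/security.cron: 611279       1 "
    "clamscan   Mar 14 S  19  0.0     00:03:23  8.6    1389464"
    "    1573272 /usr/sbin/clamd -c /etc/clamd.d/scan.conf"
)

anchored = "%^/usr/sbin/clamd -c"   # rule from the thread
unanchored = "/usr/sbin/clamd"      # assumed reconfigured rule

for label, cmd in (("normal", normal), ("merged", merged)):
    print(label, proc_matches(anchored, cmd), proc_matches(unanchored, cmd))
```

Under these assumptions the anchored rule matches only the normal line, while the unanchored rule matches both, which is consistent with the dnsmasq entry staying green.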


John.

list John Horne · Fri, 22 Mar 2024 15:33:45 +0000 ·
On Fri, 2024-03-22 at 13:04 +0000, John Horne wrote:
> On Fri, 2024-03-22 at 10:10 +1100, Jeremy Laidman wrote:
> > This seems to be an artefact of the "xymon" command, perhaps sanitising
> > input.
>
> I would agree.

Interestingly in the source lib/stackio.c (line 58) is the input comment:

=====
 * Simultaneously, lines ending with a '\' character are
 * merged into one line, allowing for transparent handling
 * of very long lines.
=====

Unfortunately I've got to leave work in a minute, but I would say this could be
where the lines are merged. The comment comes from the 'unlimfgets' routine,
which is used by the xymon command (common/xymon.c) to read input.


John.

list Jeremy Laidman · Mon, 25 Mar 2024 10:46:11 +1100 ·
Hmm, interesting. So it looks like the Xymon command is written to support
multi-line input, such as this sort of thing:

echo "status $HOST.test\\" > $TMPFILE # continued
echo "green \\" >> $TMPFILE # continued
echo "`date`\\" >> $TMPFILE # continued
echo "Test X is OK" >> $TMPFILE # not continued
echo "The test X was tested and found to be OK" >> $TMPFILE
$XYMON $XYMSRV < $TMPFILE

so that a single (eg status) line can be split over multiple input lines by
using backslash at EOL.

The code in stackio.c that joins lines due to a trailing backslash seems to
skip over any spaces immediately after the backslash (eg "first
line...\<sp><sp><sp><lf>...first line continued").
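
The joining behaviour described here can be modelled with a short Python sketch. This is not the stackio.c code. It only assumes the rule as stated in this thread: a backslash at the end of a line, optionally followed by spaces or tabs, causes the next line to be appended. The function name and the abbreviated 'ps' columns are made up for illustration, and the carriage-return case mentioned earlier is not modelled.

```python
import re

# Assumed rule: backslash, then optional spaces/tabs, at end of line
CONTINUATION = re.compile(r"\\[ \t]*$")


def merge_continued_lines(text: str) -> str:
    """Join continued lines as described in the thread (a sketch only)."""
    out = []
    pending = None
    for line in text.split("\n"):
        if pending is not None:
            # Previous line ended in a continuation: append this one to it
            line = pending + line
            pending = None
        if CONTINUATION.search(line):
            # Drop the backslash and trailing blanks, then keep reading
            pending = CONTINUATION.sub("", line)
        else:
            out.append(line)
    if pending is not None:
        out.append(pending)
    return "\n".join(out)


# The printf test from earlier in the thread
print(repr(merge_continued_lines("testing\\   \ntesting\n")))

# Abbreviated 'ps' lines: the sed line ends in backslash + space
ps_output = (
    "2822581 \\_ sed 1i\\ /etc/cron.hourly/security.cron:\\ \n"
    " 611279 /usr/sbin/clamd -c /etc/clamd.d/scan.conf\n"
)
print(merge_continued_lines(ps_output))
```

With this model the clamd line ends up appended to the 'sed' line, which matches the corrupted hostdata entry shown at the start of the thread.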

I note that "ps" displays the multi-line "sed" command with a trailing
space.

$ ps -f | sed '1i\
'|grep '[s]ed'|tr ' ' '.'
jeremyl...3104..2448..0.10:31.pts/3....00:00:00.sed.1i\.

There is no trailing space in the process args, so ps must be adding it.
Perhaps this is to de-fang any backslashes, precisely to avoid what seems
to be happening with the xymon client.

So perhaps "escaping" a "\<sp>" sequence at the end of a line would fix
this. Something like:

ps -Aww f -o pid,ppid,user,start,state,pri,pcpu,time:12,pmem,rsz:10,vsz:10,cmd | sed 's/\\  *$/\\./'
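
As a rough check of why this substitution should behave differently from simply stripping trailing spaces, here is a Python sketch of both 'sed' expressions. It assumes that 'ps' prints the 'sed' line with a backslash and one trailing space, and that a line counts as continued when it ends in a backslash followed by optional blanks. The function names are hypothetical.

```python
import re


def strip_trailing_spaces(line: str) -> str:
    # Equivalent of: sed 's/ *$//'
    return re.sub(r" *$", "", line)


def defang_trailing_backslash(line: str) -> str:
    # Equivalent of: sed 's/\\  *$/\\./'
    return re.sub(r"\\ +$", r"\\.", line)


def looks_continued(line: str) -> bool:
    # Assumed test: backslash, then optional spaces/tabs, at end of line
    return re.search(r"\\[ \t]*$", line) is not None


# Assumed shape of the 'ps' line: ends with backslash + one space
ps_line = "4016          \\_ sed 1i\\ /etc/cron.hourly/security.cron:\\ "

for name, fn in (
    ("unchanged", lambda s: s),
    ("strip spaces", strip_trailing_spaces),
    ("defang", defang_trailing_backslash),
):
    result = fn(ps_line)
    print(f"{name:12} ends with {result[-2:]!r} continued={looks_continued(result)}")
```

Stripping the spaces leaves a bare trailing backslash, so the line would still be treated as continued. Replacing the trailing blanks with a dot means the backslash is no longer the last significant character on the line.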

J

list John Horne · Tue, 26 Mar 2024 10:44:47 +0000 ·
On Mon, 2024-03-25 at 10:46 +1100, Jeremy Laidman wrote:
> There is no trailing space in the process args, so ps must be adding it.
> Perhaps this is to de-fang any backslashes, precisely to avoid what seems to
> be happening with the xymon client.
>
> So perhaps "escaping" a "\<sp>" sequence at the end of a line would fix this.
> Something like:
>
> ps -Aww f -o pid,ppid,user,start,state,pri,pcpu,time:12,pmem,rsz:10,vsz:10,cmd | sed 's/\\  *$/\\./'

Yes, this seems to work fine. Thanks for that.


John.
