Xymon Mailing List Archive search

Integer underflow in FILE mtime/ctime/atime check

list Axel Beckert
Thu, 13 Aug 2020 12:15:00 +0200
Message-Id: <user-761fc80d7b63@xymon.invalid>

Hi Jeremy,

thanks a lot for your outside view. Maybe I was too focussed on the
integer underrun to not see potential other issues...

On Thu, Aug 13, 2020 at 01:59:50PM +1000, Jeremy Laidman wrote:
On Thu, 13 Aug 2020 at 13:18, Axel Beckert <user-bc188e45dae4@xymon.invalid> wrote:
at least in Xymon 4.2.28 (plus recent CVE patches, i.e. as currently
in Debian Stable) there is an issue with an integer underflow in the
FILE mtime/ctime/atime check which basically seems to happen when
finding the file to check takes a moment (seconds) and the file in
question has been modified after the search for the file has been
started.
Is it possible that the time on the host being monitored is in the future,
at least compared to the time of the Xymon server?
Hmmmm, you've got a point there. I expected all our servers to have
NTP, but this server in question was setup manually by a coworker and
indeed has the wrong time ? but it's like 15 seconds behind the Xymon
server (which definitely has NTP and I also compared it with two other
machines ? then again the box in question has an ntp installed, but it
wasn't running. I should probably monitor this. ;-)

Client with timestamp issues: Thu Aug 13 11:53:03 CEST 2020
Xymon Server:	      	      Thu 13 Aug 2020 09:53:18 AM UTC

I assume the different time zones (CEST being UTC+02:00) are no issue
as we have quite some clients on CEST.

Here's again the same difference just in Unix Epoch:

Client: 1597312461
Server: 1597312477

Always about 15 to 16 seconds difference, but the client is lagging behind.
Can you provide the [clock] and [file:/nfs/...] sections of your client
data message for an instance when the underflow has occurred?
Yep:

[file:/nf/2020/08/12/nfcapd.202008122015]
type:100000 (file)
mode:644 (-rw-r--r--)
linkcount:1
owner:56137 (nfsen)
group:48 (apache)
size:232673177
clock:1597256407 (2020/08/12-20:20:07)
atime:1597256114 (2020/08/12-20:15:14)
ctime:1597256414 (2020/08/12-20:20:14)
mtime:1597256414 (2020/08/12-20:20:14)

[?]

[clock]
epoch: 1597256407.516640
local: 2020-08-12 20:20:07 CEST
UTC: 2020-08-12 18:20:07 GMT

(BTW, is it normal that the [clientversion] section is empty?)
It's hard to see how a delay in the collection of file timestamps could
cause this underflow, as the creation of the [clock] section (the source of
"now") is executed after the creation of the [file:] section (the source of
the MTIME value).
Good point. Solely from the facts above I'd expected that it is
evaluated (but not printed) first.

Then again, due to that empty [clientversion] section, I checked the
actually installed xymon-client version and it's a horribly old client
version (4.3.10). (It seems as if I really should get rid of manually
installed servers or align those boxes even more with Ansible. :-)

So forget this if this is solely caused by the ancient client. (And
sorry for the noise because I just gave the server version and didn't
even think to check the client version.)
However, If the [clock] section of the client message does not exist, the
the Xymon server will use its own time for its calculations.
It does exist.
In such cases this integer underflow happens and causes a falso
positive due to instead of the time difference being negative, it's
insanely huge:
No matter the cause, Xymon should take into account the possibility that
the timestamp is in the future, and at least show "was modified N seconds
into the future" or something similar, after handling or avoiding the
underflow.
Ack. Sometimes files copied from other machines with time in the
future could be copied onto (or mounted on) a system with proper time.
Maybe there are even applications where future timestamps make sense,
i.e. if they're being misused for storing some date or so, who knows.
Perhaps create the following entry in clientlocal.cfg:

file:`exec >/tmp/clock-test 2>&1; date; $XYMONHOME/bin/logfetch --clock ;
ls -l $(ls -1d /nf/2???/??/??/ | tail -1)* | tail -1;
$XYMONHOME/bin/logfetch --clock; date`

The output file /tmp/clock-test might give you some idea of what's going
on. Remember that it can take up to 10 minutes for updates to
clientlocal.cfg to take effect on the client.
Will do in case the other facts above do not suffice already.

		Kind regards, Axel
-- 
PGP: 2FF9CD59612616B5      /~\  Plain Text Ribbon Campaign, http://arc.pasp.de/
Mail: user-bc188e45dae4@xymon.invalid  \ /  Say No to HTML in E-Mail and Usenet
Mail+Jabber: user-0064bde8d49d@xymon.invalid  X
https://axel.beckert.ch/   / \  I love long mails: https://email.is-not-s.ms/