Xymon Mailing List Archive search

Histlog file format anomaly - epochtimestamps

5 messages in this thread

list Jeremy Laidman · Fri, 25 Aug 2023 12:42:37 +1000 ·
Hi y'all

We've setup some new RHEL 7.9 Xymon servers running 4.3.30-1 from the
Terabithia repo. Everything seems to work fine except that the event
history list ("last 50 events") has links that use the epoch time format
for the individual events, but the files in $XYMONHISTLOGS directories
contain files in the old format such as Fri_Aug_25_09:59:03_2023. When I
click on an event dot, I'm given a webpage with the message "Historical
status log not available". If I rename/symlink the file to the epoch time
format (eg 1692921543) then it displays just fine.

I'm aware of a patch in the Terabithia packages that change the filename
format. But reading through the patch (from the SRPM), it looks like it's
supposed to handle both old and new formats just fine. One of the things
the patch provides is the ability to override the filename format using
either the "--epochtimestamps" switch or setting the "EPOCHIST" variable,
to (I think) tell "xymond_history" which format to use when creating new
history logfiles. Without these, the code should assume write format, but
read either format.

So, a couple of things here:

1) With the patch in place, new history log files should be created using
the new filename format. But they're being created with the old format.

2) With the patch in place, existing history log files with filenames in
the old format should still be readable by the CGI scripts. But they
history.sh file is failing when it can't find filenames in the new format.

Running strings on the Xymon binaries doesn't show the strings
"epochtimestamps" or "EPOCHHIST". So I don't think the patch has been fully
applied. And I don't think I can use the --epochtimestamps option to fix it.

So I'm not sure if I've done something fundamentally wrong here (probably),
or (less likely) if the patch in the SRPM has not cleanly applied when the
the RPM was built. Does anyone else have the Terabithia 4.3.30-1 package
installed on RHELv7.9, who can tell me what format your history status
files (under /var/lib/xymon/histlog/<hostname>/<testname>) are in? And can
tell me if the history events display correctly?

J
list Japheth Cleaver · Fri, 25 Aug 2023 02:23:39 -0700 ·
quoted from Jeremy Laidman
On Thu, August 24, 2023 19:42, Jeremy Laidman wrote:
Hi y'all

We've setup some new RHEL 7.9 Xymon servers running 4.3.30-1 from the
Terabithia repo. Everything seems to work fine except that the event
history list ("last 50 events") has links that use the epoch time format
for the individual events, but the files in $XYMONHISTLOGS directories
contain files in the old format such as Fri_Aug_25_09:59:03_2023. When I
click on an event dot, I'm given a webpage with the message "Historical
status log not available". If I rename/symlink the file to the epoch time
format (eg 1692921543) then it displays just fine.
I had to double-check this one, but I can confirm that on Terabithia (w/o
any further configuration) the links generated look something like:
http://127.0.0.1:2180/xymon-cgi/historylog.sh?HOST=trampoline.local&SERVICE=memory&TIMEBUF=1692950158

However the files in the directory are:
[root at trampoline histlogs]# find -name Fri_Aug_25_00:55:58_2023
./trampoline_local/cpu/Fri_Aug_25_00:55:58_2023
./trampoline_local/disk/Fri_Aug_25_00:55:58_2023
./trampoline_local/inode/Fri_Aug_25_00:55:58_2023
./trampoline_local/procs/Fri_Aug_25_00:55:58_2023
./trampoline_local/ports/Fri_Aug_25_00:55:58_2023
./trampoline_local/msgs/Fri_Aug_25_00:55:58_2023
./trampoline_local/files/Fri_Aug_25_00:55:58_2023
./trampoline_local/memory/Fri_Aug_25_00:55:58_2023

And they are being pulled up correctly.

After adding --epochtimestamps to the xymond_history CMD:
[history]
        ENVFILE /etc/xymon/xymonserver.cfg
        NEEDS xymond
        CMD xymond_channel --channel=stachg xymond_history --epochtimestamps
        LOGFILE $XYMONSERVERLOGS/history.log
        PIDFILE $XYMONRUNDIR/xymond_history.pid

We are now storing new records in epoch:

[root at trampoline histlogs]# ls -lt trampoline_local/memory/
total 20
-rw-r--r--. 1 xymon xymon 374 Aug 25 02:12 1692954767
-rw-r--r--. 1 xymon xymon 157 Aug 25 02:12 1692954733
-rw-r--r--. 1 xymon xymon 374 Aug 25 00:55 Fri_Aug_25_00:55:58_2023
-rw-r--r--. 1 xymon xymon 374 Aug 24 15:38 Thu_Aug_24_15:38:30_2023
-rw-r--r--. 1 xymon xymon 374 Aug 24 12:37 Thu_Aug_24_12:37:39_2023

However the main event and event history pages still seem to pull both up
quoted from Jeremy Laidman
fine.

I'm aware of a patch in the Terabithia packages that change the filename
format. But reading through the patch (from the SRPM), it looks like it's
supposed to handle both old and new formats just fine. One of the things
the patch provides is the ability to override the filename format using
either the "--epochtimestamps" switch or setting the "EPOCHIST" variable,
to (I think) tell "xymond_history" which format to use when creating new
history logfiles. Without these, the code should assume write format, but
read either format.
Correct. The intent was to use epoch for all *links* now, but not to
adjust the default output file pattern unless flagged to. (This default
would change in 4.4.) And yes, it needs to be able to read both for sure.
quoted from Jeremy Laidman

So, a couple of things here:

1) With the patch in place, new history log files should be created using
the new filename format. But they're being created with the old format.

2) With the patch in place, existing history log files with filenames in
the old format should still be readable by the CGI scripts. But they
history.sh file is failing when it can't find filenames in the new format.

Running strings on the Xymon binaries doesn't show the strings
"epochtimestamps" or "EPOCHHIST". So I don't think the patch has been
fully
applied. And I don't think I can use the --epochtimestamps option to fix
it.

The only binary that I think the string will end up in is xymond_history
itself. The various lib/*.c sources are only referenced there:

[root at trampoline histlogs]# grep epochtimestamps
/usr/libexec/xymon/xymond_history
Binary file /usr/libexec/xymon/xymond_history matches

If that's not showing up, then the patch definitely isn't applied.


I'm a little perplexed at what could be going on here, though. It's
possible there is a missed bug in here somehow, but this functionality was
added quite a while back.

The biggest drawback for the histlog format (in my mind) was lack of TZ
data in it. Is it possible the epoch is not being converted correctly back
somehow I wonder?

-jc
quoted from Jeremy Laidman

So I'm not sure if I've done something fundamentally wrong here
(probably),
or (less likely) if the patch in the SRPM has not cleanly applied when the
the RPM was built. Does anyone else have the Terabithia 4.3.30-1 package
installed on RHELv7.9, who can tell me what format your history status
files (under /var/lib/xymon/histlog/<hostname>/<testname>) are in? And can
tell me if the history events display correctly?

J

list Henrik Størner · Fri, 25 Aug 2023 11:28:21 +0100 ·
Hi,
I am sure JC has already thought of this. But remember that the history-log clean up utility must also work with the new naming convention.

Regards,
Henrik

Den 25. aug. 2023 kl. 10.23 skrev J.C. Cleaver <user-87556346d4af@xymon.invalid>:
quoted from Jeremy Laidman

?
On Thu, August 24, 2023 19:42, Jeremy Laidman wrote:
Hi y'all

We've setup some new RHEL 7.9 Xymon servers running 4.3.30-1 from the
Terabithia repo. Everything seems to work fine except that the event
history list ("last 50 events") has links that use the epoch time format
for the individual events, but the files in $XYMONHISTLOGS directories
contain files in the old format such as Fri_Aug_25_09:59:03_2023. When I
click on an event dot, I'm given a webpage with the message "Historical
status log not available". If I rename/symlink the file to the epoch time
format (eg 1692921543) then it displays just fine.
I had to double-check this one, but I can confirm that on Terabithia (w/o
any further configuration) the links generated look something like:
http://127.0.0.1:2180/xymon-cgi/historylog.sh?HOST=trampoline.local&SERVICE=memory&TIMEBUF=1692950158

However the files in the directory are:
[root at trampoline histlogs]# find -name Fri_Aug_25_00:55:58_2023
./trampoline_local/cpu/Fri_Aug_25_00:55:58_2023
./trampoline_local/disk/Fri_Aug_25_00:55:58_2023
./trampoline_local/inode/Fri_Aug_25_00:55:58_2023
./trampoline_local/procs/Fri_Aug_25_00:55:58_2023
./trampoline_local/ports/Fri_Aug_25_00:55:58_2023
./trampoline_local/msgs/Fri_Aug_25_00:55:58_2023
./trampoline_local/files/Fri_Aug_25_00:55:58_2023
./trampoline_local/memory/Fri_Aug_25_00:55:58_2023

And they are being pulled up correctly.

After adding --epochtimestamps to the xymond_history CMD:
[history]
       ENVFILE /etc/xymon/xymonserver.cfg
       NEEDS xymond
       CMD xymond_channel --channel=stachg xymond_history --epochtimestamps
       LOGFILE $XYMONSERVERLOGS/history.log
       PIDFILE $XYMONRUNDIR/xymond_history.pid

We are now storing new records in epoch:

[root at trampoline histlogs]# ls -lt trampoline_local/memory/
total 20
-rw-r--r--. 1 xymon xymon 374 Aug 25 02:12 1692954767
-rw-r--r--. 1 xymon xymon 157 Aug 25 02:12 1692954733
-rw-r--r--. 1 xymon xymon 374 Aug 25 00:55 Fri_Aug_25_00:55:58_2023
-rw-r--r--. 1 xymon xymon 374 Aug 24 15:38 Thu_Aug_24_15:38:30_2023
-rw-r--r--. 1 xymon xymon 374 Aug 24 12:37 Thu_Aug_24_12:37:39_2023

However the main event and event history pages still seem to pull both up
fine.

I'm aware of a patch in the Terabithia packages that change the filename
format. But reading through the patch (from the SRPM), it looks like it's
supposed to handle both old and new formats just fine. One of the things
the patch provides is the ability to override the filename format using
either the "--epochtimestamps" switch or setting the "EPOCHIST" variable,
to (I think) tell "xymond_history" which format to use when creating new
history logfiles. Without these, the code should assume write format, but
read either format.
Correct. The intent was to use epoch for all *links* now, but not to
adjust the default output file pattern unless flagged to. (This default
would change in 4.4.) And yes, it needs to be able to read both for sure.

So, a couple of things here:

1) With the patch in place, new history log files should be created using
the new filename format. But they're being created with the old format.

2) With the patch in place, existing history log files with filenames in
the old format should still be readable by the CGI scripts. But they
history.sh file is failing when it can't find filenames in the new format.

Running strings on the Xymon binaries doesn't show the strings
"epochtimestamps" or "EPOCHHIST". So I don't think the patch has been
fully
applied. And I don't think I can use the --epochtimestamps option to fix
it.

The only binary that I think the string will end up in is xymond_history
itself. The various lib/*.c sources are only referenced there:

[root at trampoline histlogs]# grep epochtimestamps
/usr/libexec/xymon/xymond_history
Binary file /usr/libexec/xymon/xymond_history matches

If that's not showing up, then the patch definitely isn't applied.


I'm a little perplexed at what could be going on here, though. It's
possible there is a missed bug in here somehow, but this functionality was
added quite a while back.

The biggest drawback for the histlog format (in my mind) was lack of TZ
data in it. Is it possible the epoch is not being converted correctly back
somehow I wonder?

-jc

So I'm not sure if I've done something fundamentally wrong here
(probably),
or (less likely) if the patch in the SRPM has not cleanly applied when the
the RPM was built. Does anyone else have the Terabithia 4.3.30-1 package
installed on RHELv7.9, who can tell me what format your history status
files (under /var/lib/xymon/histlog/<hostname>/<testname>) are in? And can
tell me if the history events display correctly?

J

list Japheth Cleaver · Sat, 26 Aug 2023 00:53:29 -0700 ·
quoted from Henrik Størner
On Fri, August 25, 2023 03:28, Henrik St??rner wrote:
Hi,
I am sure JC has already thought of this. But remember that the
history-log clean up utility must also work with the new naming
convention.

Regards,
Henrik
Alas, I believe this was a miss of mine.

Fortunately, I believe
https://sourceforge.net/p/xymon/code/HEAD/tree/branches/4.x-master/xymond/trimhistory.c#l335
prevents overtrimming for an "invalid" file name, but this does need to be
added in properly. Thanks.

Jeremy: Might you be able to try adjusting a file name as +/- one hour?
I'm wondering if DST calculations might be throwing the lookup off here.

Regards,
-jc
list Jeremy Laidman · Mon, 28 Aug 2023 15:55:31 +1000 ·
quoted from Japheth Cleaver
On Sat, 26 Aug 2023 at 17:53, J.C. Cleaver <user-87556346d4af@xymon.invalid> wrote:
Jeremy: Might you be able to try adjusting a file name as +/- one hour?
I'm wondering if DST calculations might be throwing the lookup off here.
OK, so mystery solved. This was a self-inflicted injury. Sorry to waste
peoples' time.

It turns out that some of the binaries installed from the Terabithia
package had been replaced (but not by me) by binaries of the same code
release (4.3.30) but compiled from source without any of the patches
provided in the Terabithia .src.rpm file.

While I haven't yet reverted the binaries to the original, I have
confirmed that this would likely fix the issue I was having. If I execute
the original CGI binary (which was renamed to *.bak) with suitable
QUERY_STRING then I get the correct results instead of the error message.
And in particular, if I set the TIMEBUF to be the penultimate entry in the
hist/<hostname>.<servicename> file, and run the svcstatus.cgi.bak file
under strace, then I can see that it's trying the new format, and then
reverting to the old format, in a way that matches the old-style filename
in histlogs, which have names relative to my local timezone.

$ SCRIPT_NAME= REQUEST_METHOD=GET
QUERY_STRING="HOST=$HOST&SERVICE=$SERVICE&TIMEBUF=$TIMEBUF" strace -f
/usr/share/xymon/bin/svcstatus.cgi.bak --no-svcid --historical 2>&1 | grep
histlogs
stat("/var/lib/xymon/histlogs/<hostname-redacted>/procs/1693158425",
0x7ffcade28c80) = -1 ENOENT (No such file or directory)
stat("/var/lib/xymon/histlogs/<hostname-redacted>/procs/Mon_Aug_28_03:47:05_2023",
{st_mode=S_IFREG|0644, st_size=48102, ...}) = 0
open("/var/lib/xymon/histlogs/<hostname-redacted>/procs/Mon_Aug_28_03:47:05_2023",
O_RDONLY) = 3

(Side note: if I don't define SCRIPT_NAME when running historylog.sh, I get
a segfault in svcstatus.c/parse_query() line 154 at the SBUF_MALLOC.)

I wasn't aware of the substituted binary. Apparently, the reason for this
was to work around a seg fault in do_request() that occurred when debugging
was enabled. The work-around was to double the memory allocated for
"clienturi" when history.sh creates its links to svcstatus.sh (from 1024 to
2048). I'll investigate this a bit more and might have a patch to apply one
day soon...

Again, sorry for the false alarm, and thanks for helping me to narrow down
the search.

J