Xymon Mailing List Archive

monitoring contents of a logfile with a daily changing filename

10 messages in this thread

list Ian Diddams · Wed, 15 Aug 2018 14:37:20 +0000 (UTC) ·
Xymon 4.3.17 / 4.3.28 / 4.3.15 / 4.3.12 (multiple servers)
CentOS 6 & 7
For a client I've been asked to set up a reasonably simple LOG file check in their various Xymon installations.

Each client to be checked has daily changing messages filenames (via rsyslog.conf).  The filename for today is for example
/var/log/external/<client hostname>/messages-20180815.log

i.e. it is always in /var/log/external/<client hostname>/ with the formatted filename messages-<YYYYMMDD>.log
So presumably, via the Xymon server's analysis.cfg and client-local.cfg, I have to be able to tell the Xymon server to use the logfile with the naming convention

/var/log/external/<client hostname>/messages-<YYYYMMDD>.log

Any ideas how to manage this?

The only other way I can think of is a cron job that automagically resets a softlink, e.g. /var/log/messages, to the daily log, run at midnight on every client - but that's just another level of kludginess that requires some overseeing to ensure every client "works".  This really needs to be done centrally.

Summary:  logfile paths have two quasi-random elements - the client hostname in the path, and the date in the logfile name
didds
list Damien Martins · Wed, 15 Aug 2018 19:25:47 +0200 ·
Hi Ian,

My suggestion (not tested):
- On your client, edit the ~xymon/client/etc/client-local.cfg file
- In your host definition, add this line:
log:/var/log/external/$REPLACE_WITH_CLIENT_NAME/messages-$(date +%Y%m%d).log
- If the previous one does not work, this one should do the job:
log:$(find /var/log/external/$REPLACE_WITH_CLIENT_NAME/ -name messages-$(date +%Y%m%d).log)
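For reference, the path those lines are trying to produce can be sketched in Python. This is only an illustration of the naming convention from the original post; the function name and the host name "client1" are made up here and are not part of any Xymon configuration.

```python
from datetime import date


def daily_log_path(hostname: str, day: date) -> str:
    """Build the per-host, per-day path described in the thread.

    Only the directory layout and the YYYYMMDD suffix come from the
    original post; the function and argument names are hypothetical.
    """
    return f"/var/log/external/{hostname}/messages-{day:%Y%m%d}.log"


# "client1" is a placeholder host name.
print(daily_log_path("client1", date(2018, 8, 15)))
# /var/log/external/client1/messages-20180815.log
```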

Regards,
Damien Martins

On 15/08/2018 at 16:37, Ian Diddams via Xymon wrote:
list Ian Diddams · Thu, 16 Aug 2018 14:40:27 +0000 (UTC) ·
Ok - another angle.  I feel I am SO close.
So I have a client with message logs with the filename format
/var/log/messages-YYYYMMDD.log
It contains a trigger word DIDDS
client-local.cfg on the xymon SERVER contains

[linux]
log:/var/log/messages:10240
log:`find /var/log -maxdepth 1 -type f -name messages-\*.log`:10240
log:/var/log/maillog:10240
log:/var/log/secure:10240
ignore MARK

The client's msgs GUI page shows

No entries in /var/log/messages
No entries in /var/log/messages-20180816.log
No entries in /var/log/maillog
No entries in /var/log/secure


Full log /var/log/messages
Full log /var/log/messages-20180816.log
Full log /var/log/maillog
Full log /var/log/secure


i.e. it can find/knows about that respective messages file.
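To see which files that backticked find command would hand to the client, here is a rough Python equivalent. It is a sketch for checking the pattern by hand, not something Xymon runs; the function name is hypothetical.

```python
from pathlib import Path


def find_dated_logs(directory):
    """Roughly what `find DIR -maxdepth 1 -type f -name 'messages-*.log'` lists."""
    base = Path(directory)
    return sorted(
        str(p)
        for p in base.glob("messages-*.log")   # non-recursive, like -maxdepth 1
        if p.is_file() and not p.is_symlink()  # regular files only, like -type f
    )


if __name__ == "__main__":
    for name in find_dated_logs("/var/log"):
        print(name)
```

Note that the pattern matches every dated file still present in the directory, not only today's, so older files that have not been removed would be listed as well.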

However...

in analysis.cfg, for the respective client this line
 LOG %/var/log/messages*.log "DIDDS"  COLOR=yellow

doesn't flag anything - even if the string DIDDS is in that messages-20180816.log file ..
hence the line in the GUI
No entries in /var/log/messages-20180816.log


SO CLOSE.

what am I missing here?


Because if I merely use
LOG %/var/log/messages "DIDDS"  COLOR=yellow
with DIDDS within /var/log/messages  it goes yellow almost immediately.
???
didds
list Ian Diddams · Thu, 16 Aug 2018 14:49:07 +0000 (UTC) ·
Further to my previous message...


From the analysis.cfg man page:


LOG logfilename pattern [COLOR=color] [IGNORE=excludepattern] [OPTIONAL]

...
"logfilename" is the name of the logfile. Only logentries from this filename will be matched against this rule. Note that "logfilename" can be a regular expression (if prefixed with a '%' character). 

As in my previous message, the entry for the client in analysis.cfg on the server is
 LOG %/var/log/messages*.log "DIDDS"  COLOR=yellow

so it IS prefixed by a %
and the proof that this isn't picking up the contents of the requisite log file is that the GUI page line
Full log /var/log/messages-20180816.log

does not have
<...CURRENT...>
DIDDS
below it - as my test for plain /var/log/messages does.
didds
list Ian Diddams · Thu, 16 Aug 2018 14:57:08 +0000 (UTC) ·
 well...


I've really no idea what is happening now!

NOW the GUI page shows

No entries in /var/log/messages
No entries in /var/log/messages-20180816.log
No entries in /var/log/maillog
No entries in /var/log/secure

Full log /var/log/messages
Full log /var/log/messages-20180816.log
<...CURRENT...>
DIDDS
Full log /var/log/maillog
Full log /var/log/secure

i.e. it IS showing the contents of messages-20180816.log.  So

1) it knows about the correct log
2) it has the log file's contents
but
3) it is failing to note that it contains the trigger word.

Summary:

server side client-local.cfg :   log:`find /var/log -maxdepth 1 -type f -name messages-\*.log`:10240
server side analysis.cfg :       LOG %/var/log/messages*.log "DIDDS"  COLOR=yellow
The server side must work because it worked for the simple test against /var/log/messages
didds


list Ian Diddams · Thu, 16 Aug 2018 15:08:05 +0000 (UTC) ·
 Sorry Andy - I'm not with you?

"Convert your rule in analysis.cfg from a glob to a regex :-

LOG %/var/log/messages.*.log "DIDDS"|  COLOR=yellow"
The file name is messages-YYYYMMDD.log - what you have above seems to be
messages.<something>.log ?  And the pipe, i.e. "|" ?
Sorry to be so thick!

didds
list Mike Burger · Thu, 16 Aug 2018 11:08:20 -0400 ·
Hello, Ian.

Please note that it's only monitoring the most recent 10240 bytes of the logs in question. This means that unless the text you're looking for is in the last 10240 bytes of the log, it will not flag.

On 2018-08-16 10:40, Ian Diddams via Xymon wrote:
-- 
Mike Burger
http://www.bubbanfriends.org

"It's always suicide-mission this, save-the-planet that. No one ever just stops by to say 'hi' anymore." --Colonel Jack O'Neill, SG1
list Ian Diddams · Thu, 16 Aug 2018 15:40:08 +0000 (UTC) ·
I realised the solution came about via direct emails with Andy, so here is the solution for posterity:
Now the LOG config in analysis.cfg is simply:
LOG %/var/log/messages.*.log "XXXX"  COLOR=yellow
And the accompanying client-local.cfg entry is:
[linux]
log:/var/log/messages:10240
log:`find /var/log -maxdepth 1 -type f -name messages-\*.log`:10240000000
The requirement was the regex wildcard .* not the glob of just *.
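
The difference between the two patterns can be checked outside Xymon. The sketch below uses Python's re module as a stand-in for the regex engine Xymon uses, and assumes an unanchored search; for these two simple patterns an anchored match gives the same answers.

```python
import fnmatch
import re

path = "/var/log/messages-20180816.log"

glob_style = "/var/log/messages*.log"    # the pattern tried first
regex_style = "/var/log/messages.*.log"  # the pattern that worked

# Read as a shell glob, '*' means "any run of characters", so this matches.
print(fnmatch.fnmatchcase(path, glob_style))        # True

# Read as a regex, 's*' means "zero or more 's'", then '.' is any single
# character, then the literal "log" must follow. "-20180816" is in the way.
print(re.search(glob_style, path) is not None)      # False

# '.*' is the regex spelling of "any run of characters".
print(re.search(regex_style, path) is not None)     # True
```

So a name prefixed with % in analysis.cfg is read as a regex, and the * in the original rule only repeated the preceding "s".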

Next question...

I now see if I place the trigger word DIDDS in an appropriate logfile xymon does spot it and  alert on it, and the GUI page shows the contents of the logfile as 
<CURRENT>
DIDDS

But after 10 minutes...  it drops the alert and the GUI page no longer lists those contents - in fact it shows nothing as the contents.

Anybody any ideas as to why this behaviour is seen?  These tests are at the moment using dummy logfiles which have no other info in them.  It is literally only the line "DIDDS" which remains there.

If I add another line DIDDS once again it spots it and then alerts and shows it as 
<CURRENT>
DIDDS.

For ten minutes - then it all drops again...

any ideas why this is occurring?  

didds
list Schminke_Erik_D · Fri, 17 Aug 2018 11:58:40 -0500 ·
Hang on.  That's not entirely correct.  Xymon does not look at "the last
SIZE bytes of the log file".  Through a coincidence, it might.. but that's
not what it does.  The rules that govern what gets returned are a little more
complicated, but important to understand to avoid tearing out all your hair
while troubleshooting.

The SIZE component of the LOG entry only specifies the maximum amount of
data to send back to the server.  The logfetch program on the client side
will take the last 30 minutes (kinda) of the file into consideration for
what it sends back.  An IGNORE rule removes lines from consideration (will
not be sent, will not count against the max SIZE).  Then, TRIGGER rules
will send all matched lines even if it exceeds max SIZE.  If what was found
by any TRIGGER rules is less than max SIZE, it will include the difference
from any remaining lines, up to the max SIZE.  Still, only the last 30
minutes (kinda) are considered.
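
The selection rules in the paragraph above can be modelled roughly as follows. This is a sketch of the description in this message, not logfetch's actual code: the function name is made up, and filling the leftover budget from the newest lines first is an assumption.

```python
import re


def select_lines(lines, max_size, ignore=None, trigger=None):
    """Pick the lines to send for one log, following the rules described above."""
    ign = re.compile(ignore) if ignore else None
    trg = re.compile(trigger) if trigger else None

    # IGNORE: matching lines are dropped and never count against max_size.
    candidates = [l for l in lines if not (ign and ign.search(l))]

    # TRIGGER: matching lines are always kept, even past max_size.
    keep = [bool(trg and trg.search(l)) for l in candidates]
    budget = max_size - sum(
        len(l.encode()) for l, k in zip(candidates, keep) if k
    )

    # Fill whatever budget is left with other lines (assumption: newest first).
    for i in range(len(candidates) - 1, -1, -1):
        if keep[i]:
            continue
        size = len(candidates[i].encode())
        if size > budget:
            break
        keep[i] = True
        budget -= size

    return [l for l, k in zip(candidates, keep) if k]


lines = ["MARK\n", "old noise\n", "DIDDS alert\n", "recent a\n", "recent b\n"]
print(select_lines(lines, 24, ignore="MARK", trigger="DIDDS"))
# ['DIDDS alert\n', 'recent b\n']
```

With a max size of 5 the same call still returns the DIDDS line, because trigger matches are sent regardless of the limit.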

I say kinda, because the logfetch program works like this.  Every time the
logfetch program checks a log file, it takes note of the current size of
the log file.  It keeps track of this in the STATUSFILE.  (See logfetch
manpage).  Each line of the STATUSFILE lists the log files it's watching
followed by a "queue" of numbers.  Those numbers represent the size of the
log file at the last 6 times it was checked.  Every time logfetch runs, it
unshifts the current size of the log file onto the front of the queue and
pops the last number off the end of the queue.  Then, logfetch opens the
log file, seeks to the byte number that it popped off the queue, and reads
to the end of file.  So, logfetch returns the last "6 * check interval"
minutes worth of entries in the log.  Check interval is USUALLY 5 minutes,
hence the 30 minutes.
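
A small simulation of that queue, written from the description above rather than from the logfetch source. The function name, the all-zero starting queue, and the depth of 6 are assumptions for illustration.

```python
from collections import deque


def simulate(sizes_at_each_check, depth=6):
    """Return the (start, end) byte range read on each check."""
    queue = deque([0] * depth)    # assumed initial state: no history yet
    windows = []
    for size in sizes_at_each_check:
        queue.appendleft(size)    # "unshift" the current file size
        start = queue.pop()       # oldest remembered size = where to seek
        windows.append((start, size))
    return windows


# A 6-byte file ("DIDDS\n") that never grows, checked 8 times.
for n, (start, end) in enumerate(simulate([6] * 8), 1):
    print(f"check {n}: read bytes {start}..{end} ({end - start} bytes)")
```

In this model the single line is returned on checks 1 through 6, and from check 7 onwards the read window is empty, so a line in a file that stops growing eventually ages out of what the client reports.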

If it's not returning what you're expecting to get back from the logs, it's
most likely due to how logfetch only concerns itself with that "last 6
checks" worth of the log.


Erik D. Schminke | Associate Systems Programmer
Hormel Foods Corporation | One Hormel Place | Austin, MN XXXXX
Phone: (XXX) XXX-XXXX
user-15513f33c451@xymon.invalid | www.hormelfoods.com
list Mike Burger · Fri, 17 Aug 2018 13:37:36 -0400 ·
Thank you for correcting my understanding...I appreciate it.
-- 
Mike Burger
http://www.bubbanfriends.org

"It's always suicide-mission this, save-the-planet that. No one ever just stops by to say 'hi' anymore." --Colonel Jack O'Neill, SG1