Xymon Mailing List Archive

Tricky one for log file monitoring

Henrik Størner
Thu, 22 Mar 2012 12:11:01 +0100
Message-Id: <0457e9591dc834825cb209d46ce2e8fd@localhost>

On Thu, 22 Mar 2012 10:41:09 -0000, "Neil Simmonds"
<user-8188d25e65e4@xymon.invalid> wrote:
> Message appears in log file for failure - from this we want an alert
> that will stay active and not expire after 30 minutes like log file
> alerts usually do.
>
> We will hopefully then get a message in the log file that tells us of
> completion of the failed process, at this point we want to clear the
> alert.

It's not something that the Xymon client will do automatically, but you
can script your way out of it. What I would do is create a custom test
for this, something like:

#!/bin/sh

# Logfile we monitor
FN="/var/log/mylogfile"
# Message patterns that say "alert" or "OK"
ALERTMSG="Something bad"
OKMSG="All OK"

# Use the data from the "logfetch" status to grab the last 5 minutes of log data
FPOS=`cat $XYMONTMP/logfetch.${MACHINEDOTS}.status | grep "^${FN}:" | cut -d: -f2`
# If logfetch has no entry for this file yet, start from the beginning
FPOS=${FPOS:-0}
LASTMSG=`dd if="$FN" bs=1 skip=$FPOS 2>/dev/null | grep -E "$ALERTMSG|$OKMSG" | tail -n 1`

# LASTMSG now holds the last message which is either an alert or an OK message
#
# Actually the whole "cat ... grep ... cut ... dd .." thing is not needed, since
# you could just scan the entire logfile and pick out the last message which is
# either OK or alert... you could just do
# LASTMSG=`egrep "$ALERTMSG|$OKMSG" $FN | tail -n 1`

# Determine color
COLOR="green"
if test `echo "$LASTMSG" | grep -c "$ALERTMSG"` -ne 0
then
   COLOR=red
fi

# Send the status with a very long duration so it doesn't go purple.
$XYMON $XYMSRV "status+365d $MACHINE.mylog $COLOR `date`

Last message seen: $LASTMSG
"

exit 0
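
One caveat with the 5-minute window in the script above: when neither pattern shows up in that window, LASTMSG is empty and the script reports green, which would clear an alert that is still open. Scanning the whole logfile avoids this. Another option is to remember the previously reported color, roughly as sketched below. The function name decide_color and the state file path are hypothetical and not part of Xymon; treat this as a sketch, not a tested Xymon extension.

```shell
#!/bin/sh
# Sketch: pick the status color from the newest matching log line, and
# fall back to the previously reported color when there is no new match.
# decide_color is a hypothetical helper, not a Xymon command.
#
# Usage: decide_color LASTMSG ALERTMSG OKMSG PREVIOUS_COLOR
decide_color() {
    dc_lastmsg=$1
    dc_alertmsg=$2
    dc_okmsg=$3      # unused in the decision; kept so the call mirrors the script
    dc_previous=$4

    if [ -z "$dc_lastmsg" ]; then
        # Nothing new in the log window: keep the old state (green if unknown)
        echo "${dc_previous:-green}"
    elif printf '%s\n' "$dc_lastmsg" | grep -q -- "$dc_alertmsg"; then
        echo red
    else
        echo green
    fi
}

# Possible use inside the script above (STATEFILE is a hypothetical path):
#   STATEFILE="$XYMONTMP/mylog.state"
#   PREV=`cat "$STATEFILE" 2>/dev/null`
#   COLOR=`decide_color "$LASTMSG" "$ALERTMSG" "$OKMSG" "$PREV"`
#   echo "$COLOR" > "$STATEFILE"
```

With this in place the status only turns green again once an OK message has actually been seen, which is the behaviour asked for in the original question.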


This raises two interesting ideas:

1) We should have status-messages that don't expire (go purple). Using a
very long status lifetime is a kludge, really.
2) The log analysis tool should know how to handle messages that cancel
each other out.


Regards,
Henrik