system log and application log monitoring
list Olivier Beau
Hi,

Up to now I didn't have time to look at the "non-stable" Hobbit; I have just started playing with "msgs" log reporting. Henrik's work is remarkable :)

(This might have already been discussed on the list.)

As for me, system log monitoring should be a default, which is not the case for my network. Well, I was glad to find OS log file definitions in client-local.cfg. Could there be basic OS pattern definitions in hobbit-clients.cfg's DEFAULT?

Next step: application log monitoring. Let's say I have 100 servers (different OSes, of course) running MySQL, and I want to follow "ended" in /var/log/mysqld.log. Setting up 100 entries in client-local.cfg doesn't seem great. Could there be some kind of grouping in client-local.cfg (by PAGE, actually)? (I guess this would require processing client-local.cfg before transferring it to the clients.)

Olivier
--
Olivier Beau
list Henrik Størner
On Sun, May 21, 2006 at 07:29:49PM +0200, Olivier Beau wrote:
> Well, I was glad to find OS log file definitions in client-local.cfg. Could there be basic OS pattern definitions in hobbit-clients.cfg's DEFAULT?
You'll have to contribute some, then. I don't really know what people are looking for in their logfiles.
> Next step: application log monitoring. Let's say I have 100 servers (different OSes, of course) running MySQL, and I want to follow "ended" in /var/log/mysqld.log. Setting up 100 entries in client-local.cfg doesn't seem great. Could there be some kind of grouping in client-local.cfg (by PAGE, actually)? (I guess this would require processing client-local.cfg before transferring it to the clients.)
Welcome to the world of configuration "classes".

Step 1: Put a "CLASS:mysqlservers" on those hosts in bb-hosts.
Step 2: Put a section in your client-local.cfg file with
    [mysqlservers]
    logfile:/var/log/mysql/status.log
Step 3: Configure hobbit-clients.cfg for these logfiles.

Only problem is that you'll need today's snapshot for this to work.

Regards,
Henrik
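For illustration, the three steps above might look roughly like the fragments below, one per file. This is only a sketch: the host name and IP address are hypothetical, the "logfile:" line is copied from the message above, and the CLASS=/LOG rule for hobbit-clients.cfg is an assumption about that file's syntax that should be checked against the man page for your snapshot.

```
# bb-hosts (hypothetical host entry; repeat the CLASS tag on every MySQL host)
192.0.2.10   db01.example.com   # CLASS:mysqlservers

# client-local.cfg (section name matches the class name)
[mysqlservers]
logfile:/var/log/mysql/status.log

# hobbit-clients.cfg (assumed rule syntax: go yellow when "ended" shows up)
CLASS=mysqlservers
    LOG /var/log/mysql/status.log ended COLOR=yellow
```

The point of the class is that the log file is listed once, under the class name, instead of once per host.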
list Jeff Newman
Henrik,
Is there a facility already in place, or a way to graph the number of "hits"
returned by a pattern match for a log file?
For instance:
I am checking xyz log file for the word "wrap". It would be *very* useful to have
a graph that shows the number of times that word showed up between the previous
check and the current check.
This could be very useful to illustrate, say, a disk dying (one blip
of a bad read or something would be one thing, but looking at a graph
over time that shows 1 blip one week, 10 the next, and 20 the week
after that would indicate the disk was almost dead) etc...
Right now, the only way I have to do this is with a client side script that
runs in a constant loop:
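INITIALNUM=${INITIALNUM:-0}   # start from 0 unless already set (assumption)
while true; do
    # Total number of matching lines currently in the log
    NUM=`grep "Buffer wrapped" /quotes/env/errlog | wc -l | sed 's/ *//g'`
    if [ "$NUM" -gt "$INITIALNUM" ] ; then
        # Report only the matches added since the previous check
        WRAP_NUM=`expr $NUM - $INITIALNUM`
        $BB $BBDISP "status $MACHINE.wraps green `date`
wraps:$WRAP_NUM
"
        INITIALNUM=$NUM
    else
        OKNUM=0
        $BB $BBDISP "status $MACHINE.wraps green `date`
wraps:$OKNUM
"
        INITIALNUM=$NUM   # resync in case the log was rotated and the count dropped
    fi
    sleep 300   # pause between checks; the interval is an assumption
done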
-Jeff
list Henrik Størner
On Fri, Jun 02, 2006 at 11:03:52AM -0500, Jeff Newman wrote:
> Is there a facility already in place, or a way to graph the number of "hits" returned by a pattern match for a log file? For instance: I am checking xyz log file for the word "wrap". It would be *very* useful to have a graph that shows the number of times that word showed up between the previous check and the current check.
No, there isn't.
> This could be very useful to illustrate, say, a disk dying (one blip of a bad read or something would be one thing, but looking at a graph over time that shows 1 blip one week, 10 the next, and 20 the week after that would indicate the disk was almost dead) etc...
Hobbit only looks at log entries over a 30-minute period, so we would have to extend that significantly. So this would have to be done at the client side rather than on the server. (Not a problem, I'm just thinking out loud).
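> Right now, the only way I have to do this is with a client side script that
> runs in a constant loop:
> INITIALNUM=${INITIALNUM:-0}   # start from 0 unless already set (assumption)
> while true; do
>     # Total number of matching lines currently in the log
>     NUM=`grep "Buffer wrapped" /quotes/env/errlog | wc -l | sed 's/ *//g'`
>     if [ "$NUM" -gt "$INITIALNUM" ] ; then
>         # Report only the matches added since the previous check
>         WRAP_NUM=`expr $NUM - $INITIALNUM`
>         $BB $BBDISP "status $MACHINE.wraps green `date`
> wraps:$WRAP_NUM
> "
>         INITIALNUM=$NUM
>     else
>         OKNUM=0
>         $BB $BBDISP "status $MACHINE.wraps green `date`
> wraps:$OKNUM
> "
>         INITIALNUM=$NUM   # resync in case the log was rotated and the count dropped
>     fi
>     sleep 300   # pause between checks; the interval is an assumption
> done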
If all that you want is the graph and not alerts, then I wonder if it
couldn't be done more easily. Just do the "grep" and report the number
like you do now. Then send it into the NCV handler, with a dataset
definition that uses the DERIVE datatype (which is the default, btw).
Then RRDtool should handle all of the "subtract current value from
previous value if it's greater, else ..." stuff and you needn't
worry about it.
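
A minimal sketch of that simpler approach, assuming the usual BB client environment ($BB, $BBDISP, $MACHINE) and the log path from the script quoted above. The function names count_matches and build_status are hypothetical helpers, not part of Hobbit. The script reports the running total on every run and leaves the "difference since last time" arithmetic to RRDtool.

```shell
#!/bin/sh
# Sketch: report the running total and let RRDtool's DERIVE type compute the rate.

# Print how many lines of file $2 match pattern $1 (0 if the file is unreadable).
count_matches() {
    n=$(grep -c -e "$1" "$2" 2>/dev/null) || true
    echo "${n:-0}"
}

# Build a status message whose body is a single "name : value" line for NCV.
build_status() {
    printf 'status %s.wraps green %s\n\nwraps : %s\n' "$1" "$(date)" "$2"
}

# Send only when the BB client environment is present.
if [ -n "${BB:-}" ] && [ -n "${BBDISP:-}" ]; then
    total=$(count_matches "Buffer wrapped" /quotes/env/errlog)
    $BB $BBDISP "$(build_status "$MACHINE" "$total")"
fi
```

On the server, the "wraps" status would still have to be routed to the NCV handler. As far as I recall this is done in hobbitserver.cfg by adding "wraps=ncv" to TEST2RRD and, optionally, setting NCV_wraps="wraps:DERIVE"; treat those setting names as an assumption and check the documentation for your version.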
.....
After thinking a bit more about this, I believe that having a method to
do "grep ...| wc -l" in the client might be a good thing. So I've added
a new type of configuration to the client-local.cfg file, so you can do
linecount:/var/log/messages
diskerrors I/O error.*/dev/hd
badlogins Login failed
and it will report back in the client message the data
diskerrors: 0
badlogins: 2
which are the number of times these two expressions were found in the
/var/log/messages file.
Given those data, on the server side it will be easy to feed them into
a graph and do other nice things with it.
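
To make the idea concrete, here is a rough shell equivalent of what such a linecount section amounts to. It is only an illustration of the "grep ... | wc -l" idea, not the client's actual implementation; linecount_report is a hypothetical function name, and the rules are read from standard input as "ID PATTERN" lines.

```shell
#!/bin/sh
# Sketch: emulate a linecount section by counting matching lines per rule.
# Usage: linecount_report LOGFILE < rules   (each rule line: "ID PATTERN")
linecount_report() {
    file=$1
    while read -r id pattern; do
        # Skip blank lines in the rule list
        [ -n "$id" ] || continue
        # grep -c prints the number of matching lines; fall back to 0 on error
        n=$(grep -c -e "$pattern" "$file" 2>/dev/null) || true
        echo "$id: ${n:-0}"
    done
}
```

For example, feeding it the two rules above with `printf 'diskerrors I/O error.*/dev/hd\nbadlogins Login failed\n' | linecount_report /var/log/messages` prints one "ID: count" line per rule, in the same shape as the client message data shown above.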
Regards,
Henrik
list Henrik Størner
On Sun, Jun 04, 2006 at 10:04:44AM +0200, Henrik Stoerner wrote:
> and it will report back in the client message the data
> diskerrors: 0
> badlogins: 2
> which are the number of times these two expressions were found in the
> /var/log/messages file.
> Given those data, on the server side it will be easy to feed them into
> a graph and do other nice things with it.
The graphs are now created by default, so you can track the trend of how often those lines are logged.

Regards,
Henrik