Logfile monitoring - I'd like some comments

23 messages in this thread

list Henrik Størner · Tue, 14 Feb 2006 22:39:48 +0100 ·
A few days ago, I mentioned that I would like to do logfile
monitoring for the next Hobbit release.

I've worked a bit on this and have a prototype solution for
it, which you can test with the current snapshots. I'd like
some comments on how it works to make sure I haven't overlooked
something before committing myself.

There are several objectives:
- As far as is possible, logfile monitoring must be configured
  centrally, on the Hobbit server. Having to go to each server
  to (re)configure what logfiles to check and what to look for
  simply doesn't work.
- The amount of data sent from each client to Hobbit should be
  small, but it must catch the "important" stuff.
- You rarely know in advance what will be in the logs when you
  need them the most. So the monitor should give you as much
  of the log entries as possible, not just those lines that
  match some pre-defined strings or regex'es.
- Some systems log messages on multiple lines. The system must
  be able to show all parts of a log entry.
- Logfile entries must appear on the monitor for some time after
  they show up in the logs, but should also disappear after a
  while.

In other words: The ideal solution would let you have the entire
logfile available on the Hobbit server - but that obviously 
won't work. So the client should - after weeding out the really 
irrelevant stuff - send us as much of each logfile as possible.

My proposed solution is this:
- On the Hobbit server, there's a log-monitoring configuration
  file for the Hobbit clients. This defines which logfiles are
  monitored for a single client installation, or you can define
  it for a group of clients. (The idea is to define at least 
  one group for each operating system, since the standard
  system logs are OS dependent). This configuration lists the
  log filename, the maximum amount of data to send from this
  logfile, a regex "noise" filter (i.e. lines that are stripped
  from the logfile), and *optionally* a regex identifying really
  interesting stuff in the logfile that should always be 
  reported.
- When a client connects to the Hobbit server and sends the
  normal client message, the Hobbit server will respond with
  the logfile configuration for this client. So the client
  has a copy of the central configuration file, but only the
  part that it needs for itself. The reason for sending this
  as a response to the client message is to avoid an extra
  round-trip from client to server; piggy-backing the config
  push on the client message means that it is almost without
  any performance cost on the server side.
- When the client runs, it uses the local copy of the configuration
  file to determine what logs to look at. For each logfile, it
  maintains a "where-was-I-the-last-time" status, so it only
  looks at the entries made to the logfile during the past 30
  minutes. First, the client strips off any "noise" messages.
  Then, if all of the entries fit into the maximum size that
  can be reported, it sends all of the log to the Hobbit server.
  If there is more than will fit, it first checks to see if the
  regex defining the really interesting stuff is present in the
  log - if it is, then it drops anything before the interesting
  text. If there is still more than will fit, it keeps the
  interesting text + a few lines after that (to allow for
  multi-line log-entries which some OS'es have), and then
  sends that together with as much of the last part of the log as will
  fit inside the max. message size.
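
The client-side trimming steps above can be sketched roughly as follows. This is only an illustration, not the actual logfetch code: the function name `trim_log`, its parameters, and the `context` default of 3 lines are assumptions, and Python's `re` module stands in for the regex routines the client would really use.

```python
import re

def _size(chunk):
    # Bytes needed to send these lines, counting one newline per line.
    return sum(len(line.encode()) + 1 for line in chunk)

def trim_log(lines, max_bytes, noise=None, interesting=None, context=3):
    """Reduce the new log lines to roughly max_bytes (hypothetical helper)."""
    # Step 1: strip the "noise" lines.
    if noise:
        noise_re = re.compile(noise)
        lines = [line for line in lines if not noise_re.search(line)]
    # Step 2: if everything fits, send it all.
    if _size(lines) <= max_bytes:
        return lines
    # Step 3: drop anything before the first "interesting" line.
    head, rest = [], lines
    if interesting:
        want_re = re.compile(interesting)
        first = next((i for i, line in enumerate(lines)
                      if want_re.search(line)), None)
        if first is not None:
            lines = lines[first:]
            if _size(lines) <= max_bytes:
                return lines
            # Step 4: keep the interesting line plus a few following lines.
            head, rest = lines[:1 + context], lines[1 + context:]
    # Fill the remaining budget with as much of the end of the log as fits.
    budget = max_bytes - _size(head)
    tail = []
    for line in reversed(rest):
        cost = len(line.encode()) + 1
        if cost > budget:
            break
        tail.append(line)
        budget -= cost
    tail.reverse()
    return head + tail
```

Note that in this sketch the interesting line and its context are always kept, even if they alone exceed the limit; a real client would have to decide how to handle that case.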

This part has been implemented in the Hobbit daemon (hobbitd),
and in the clients via a new "logfetch" utility. This utility
uses standard regular expressions - not the Perl-compatible
ones, because that would require you to install the PCRE
library on all of your clients. The standard regex routines
are included in all (I think) system libraries used today.

The last part is what happens when the log data arrives on the
Hobbit server. Currently, there's a simple processing of this
data to just dump it into an always-green "msgs" column. What
should happen once I get it coded is:
- Data from each logfile is matched against a set of strings 
  (regex'es) defined in the hobbit-clients.cfg file. Each string 
  determines the color (red, yellow, green) and sets the color
  of the msgs column accordingly.
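
As a rough sketch of that server-side step: each rule pairs a regex with a color, and the worst color among the matching lines becomes the color of the msgs column. The rule format and the name `msgs_color` are assumptions made for illustration; the real hobbit-clients.cfg syntax is not described in this message.

```python
import re

# Higher number means more severe.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}

def msgs_color(log_text, rules):
    """Return (worst_color, hits) for one logfile's data.

    rules is a list of (regex, color) pairs, a hypothetical format.
    hits lists the (color, line) pairs that matched a rule.
    """
    compiled = [(re.compile(rx), color) for rx, color in rules]
    worst = "green"
    hits = []
    for line in log_text.splitlines():
        for rx, color in compiled:
            if rx.search(line):
                hits.append((color, line))
                if SEVERITY[color] > SEVERITY[worst]:
                    worst = color
                break  # the first matching rule decides this line
    return worst, hits
```

With rules such as `[(r"is being DOWNED", "red"), (r"[Ww]arning", "yellow")]`, a log containing one line of each kind would turn the column red, and a log with no matching lines would stay green.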

When the color has been decided, all of the normal alerting
happens automatically. I do plan on making a more fine-grained
alert mechanism (for the msgs, procs and disk statuses) so you
can direct alerts to different groups depending on exactly 
which log-message triggered the alert, but that will not be
part of this release.


So - how does that sound? Anything I've missed?


Regards,
Henrik
list Andrew Oldaker · Tue, 14 Feb 2006 15:47:56 -0600 ·
While I'm sure there will be some extensive feedback on this issue for
days to come, I would like to commend Henrik on the simple fact that we
are finally looking at an opportunity to remove the third-party client
side logfile scripts (most of us are using) from our Hobbit
installations and get back toward centralized log monitoring.

-AJO

Andrew J Oldaker
WDT Unix Systems Support
list Larry Barber · Tue, 14 Feb 2006 16:56:17 -0600 ·
The feature of the client retrieving its configuration from the server might
cause some problems with firewalls, and security types might not be willing
to open firewalls for such infrequent messages.  I like the idea of having
centralized configuration, but it would be nice if it could be implemented
without upsetting the firewall admins. It also eliminates the pure "push" of
the BigBrother client, which is also something security types like. It might
be a good idea to have an option for local client configuration.

Thanks,
Larry Barber
list Asif Iqbal · Tue, 14 Feb 2006 18:08:46 -0500 ·
quoted from Larry Barber
On Tue, Feb 14, 2006 at 04:56:17PM, Larry Barber wrote:
The feature of the client retrieving its configuration from the server might
cause some problems with firewalls, and security types might not be willing
The BB client currently uses the config feature to pull info from the Hobbit
server. It does not require an inbound connection rule for the client to
pull data from the Hobbit server. So if you currently use the BB client to
push the client info to the Hobbit server on the default destination port 1984,
then you should be good, I think.
quoted from Larry Barber
to open firewalls for such infrequent messages.  I like the idea of having
centralized configuration, but it would be nice if it could be implemented
without upsetting the firewall admins. It also eliminates the pure "push" of
the BigBrother client, which is also something security types like. It might
be a good idea to have an option for local client configuration.

-- 

Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
"..there are two kinds of people: those who work and those who take the credit...try
 to be in the first group;...less competition there."  - Indira Gandhi
list Vernon Everett · Wed, 15 Feb 2006 13:59:22 +0800 ·
Hi all

Henrik, I have given this a bit of thought, and think it's great.
(I refer to your proposal here, not my capacity for thought.)

Would it be possible to add custom strings and status?

A perfect example would be this. (from /var/adm/messages)
---snip---
Feb 10 13:31:15 afgdev tldd[649]: [ID 138416 daemon.error] TLD(0) drive
2 (device 1) is being DOWNED, status: Unable to open drive
---snip---
Anywhere else, this would not be a major issue, but on my backup server
where my tape library is attached, this is a major red alert.

Regards
    Vernon
list Henrik Størner · Wed, 15 Feb 2006 07:28:36 +0100 ·
quoted from Vernon Everett
On Wed, Feb 15, 2006 at 01:59:22PM +0800, Vernon Everett wrote:
Henrik, I have given this a bit of thought, and think it's great.
(I refer to your proposal here, not my capacity for thought.)

Would it be possible to add custom strings and status?

A perfect example would be this. (from /var/adm/messages)
---snip---
Feb 10 13:31:15 afgdev tldd[649]: [ID 138416 daemon.error] TLD(0) drive
2 (device 1) is being DOWNED, status: Unable to open drive
---snip---
Anywhere else, this would not be a major issue, but on my backup server
where my tape library is attached, this is a major red alert.

Of course you can do that. 

You should note that the config that gets pushed to the client is
merely a list of logfiles, and a rough filter to avoid sending
all of the log to the Hobbit server.

You still configure what messages trigger an alert in the
hobbit-clients.cfg file on the Hobbit server - and in your case,
you would just set things up to trigger a critical alert when 
this message shows up in your backup server logfile.


Regards,
Henrik
list Rolf Schrittenlocher · Wed, 15 Feb 2006 07:30:11 +0100 ·
Hi Henrik,

that sounds wonderful, your ideas are pretty good. I have one suggestion 
and one point to think about.

For monitoring application logs, the config file would soon become very
complex and likely unreadable. It would be nice to have support for
include files, so that general definitions (e.g. for OS-related logs)
go in the general config file and all the individual stuff goes in
<host>.config files.
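
One way such include files could be resolved is sketched below. The `include` directive name and the `load_config` helper are purely hypothetical; nothing in this thread says how (or whether) Hobbit implements this.

```python
import os

def load_config(path, _seen=None):
    """Return the config lines of `path` with include files expanded inline.

    A line of the form "include <file>" (hypothetical syntax) is replaced
    by the contents of that file, resolved relative to the including file.
    """
    seen = _seen if _seen is not None else set()
    real = os.path.realpath(path)
    if real in seen:
        return []  # each file is read at most once, which also stops loops
    seen.add(real)
    lines = []
    with open(path) as fh:
        for raw in fh:
            line = raw.rstrip("\n")
            if line.startswith("include "):
                name = line.split(None, 1)[1].strip()
                target = os.path.join(os.path.dirname(path), name)
                lines.extend(load_config(target, seen))
            else:
                lines.append(line)
    return lines
```

A general file could then hold the OS-level definitions and pull in a per-host file with a single `include web01.config` line.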

Second, there are logs which change their names once the server or the
application stops. So a "message.log" might become "message<date>.log"
while there is a new message.log now. Still, the last information in
"message<date>.log" is relevant, especially if the reason for the new
log was a crash of the application. I don't know how to deal with this
situation, as there are multiple ways logs might change their names,
but perhaps others have an idea of how to do that.

kind regards
Rolf
-- 

Mit freundlichen Gruessen
Rolf Schrittenlocher

HRZ/BDV, Senckenberganlage 31, 60054 Frankfurt 
Tel: (XX) XX - XXX XXXXX   Fax: (XX) XX XXX XXXX
LBS: user-1e39a1813094@xymon.invalid
Persoenlich: user-6ea8e907e200@xymon.invalid
list Henrik Størner · Wed, 15 Feb 2006 07:39:11 +0100 ·
quoted from Larry Barber
On Tue, Feb 14, 2006 at 04:56:17PM -0600, Larry Barber wrote:
The feature of the client retrieving its configuration from the server might
cause some problems with firewalls, and security types might not be willing
to open firewalls for such infrequent messages.
Infrequent? It's part of the client sending its status update, so it
happens every 5 minutes.

Since this is just an extension of the protocol that is already being
used for sending statuses to the Hobbit server, you won't need any
additional firewall openings.
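
To illustrate why no extra firewall opening is needed, here is a minimal sketch of a request and reply sharing one client-initiated connection. It uses `socket.socketpair()` in place of a real TCP connection to port 1984, and the message layout and the config line are made-up placeholders, not the actual Hobbit wire format.

```python
import socket
import threading

def _read_all(conn):
    # Read until the peer closes (or half-closes) its sending side.
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks).decode()

def server_side(conn, config_for_host):
    """Receive one client message, then answer on the same connection."""
    message = _read_all(conn)
    host = message.split()[1]  # hypothetical layout: "client <host> ..."
    conn.sendall(config_for_host.get(host, "").encode())
    conn.close()

def client_side(conn, message):
    """Send the status message, half-close, and read the piggy-backed reply."""
    conn.sendall(message.encode())
    conn.shutdown(socket.SHUT_WR)  # signal that the message is complete
    reply = _read_all(conn)
    conn.close()
    return reply

def demo():
    client_sock, server_sock = socket.socketpair()
    configs = {"web01": "log:/var/log/messages:10240\n"}  # placeholder config
    worker = threading.Thread(target=server_side, args=(server_sock, configs))
    worker.start()
    reply = client_side(client_sock, "client web01 linux\n[uptime]\n...")
    worker.join()
    return reply
```

The point of the sketch is only that the reply travels back over the connection the client opened, so the server never has to connect to the client.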

And it's not like you can use it for any kind of file transfer. You'll
have to get the data into the Hobbit server first, so security on that
server is obviously important (but if you care about security, you
should really care about the security of your monitoring server in the
first place). What's sent to the client can only be a part of the
Hobbit server's log configuration.

I guess you shouldn't tell your firewall admins about the "config"
request you can send through the "bb" utility ....
quoted from Asif Iqbal
I like the idea of having
centralized configuration, but it would be nice if it could be implemented
without upsetting the firewall admins. It also eliminates the pure "push" of
the BigBrother client, which is also something security types like. It might
be a good idea to have an option for local client configuration.
It is possible to run the client using a local configuration. You'll
need to install the PCRE libraries on your clients for that, but it can
be done.


Regards,
Henrik
list Charles Jones · Tue, 14 Feb 2006 23:43:20 -0700 ·
Henrik,

How will it handle monitoring files that get rotated out?  For example if the hobbit client is monitoring /var/log/messages, and a cron rotate script moves messages to messages.1 and gzips it, will the hobbit client be smart enough to reseek to the end of the newly created file?

Some log rotation setups move/rename the file to another name (which keeps the inode), and then create a new file with the same name as the original. Some copy the file to a new file and truncate the old one, and there are other variations.

*** Partially off-topic ***
While looking at another group's monitoring setup, I saw they were using a program called ****** (name doesn't matter), which I found to be inferior to Hobbit, but it did have one nice feature: the ability to test the checksum of a list of files, and send an alert if a file changed (default examples were /etc/passwd, /vmlinuz, /etc/syslog.conf).  I suppose this functionality could be achieved via a client-side external script, but I mention it here because it might be easy to add in now while you are working on the file scanning code :)
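
A minimal sketch of the "client-side external script" idea, assuming a baseline file that was created once with cksum. The function name changed_files and the baseline path are made up for illustration; nothing like this ships with Hobbit. cksum is only a CRC, so it catches ordinary edits but is no defence against deliberate tampering.

```shell
# Sketch only: report which files differ from a recorded baseline.
# Create the baseline once, e.g.:
#   cksum /etc/passwd /etc/syslog.conf > /var/tmp/file-baseline

# changed_files BASELINE FILE...
# Prints the name of every FILE whose checksum line is not in BASELINE.
changed_files() {
    baseline=$1
    shift
    for f in "$@"; do
        # cksum prints "<checksum> <size> <name>"; an exact match of the
        # whole line in the baseline means the file is unchanged.
        if ! grep -F -x -q -e "$(cksum "$f")" "$baseline"; then
            printf '%s\n' "$f"
        fi
    done
}
```

An external script could call changed_files and report a red status when the output is non-empty. How that status is delivered to the server is left out here.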

-Charles
list Henrik Størner · Wed, 15 Feb 2006 07:50:15 +0100 ·
Hi Rolf,
quoted from Rolf Schrittenlocher

On Wed, Feb 15, 2006 at 07:30:11AM +0100, Rolf Schrittenlocher wrote:
Hi Henrik,

that sounds wonderful, your ideas are pretty good. 
Thanks.
quoted from Rolf Schrittenlocher
I have one suggestion and one point to think about.

For monitoring application logs the config file would soon become very complex and likely unreadable. It would be nice to have the possibility of include files, so that general definitions (e.g. for OS-related logs) are made in the general config file, and all the individual stuff goes in <host>.config files.
That would make sense, I agree.
Second, there are logs which change their names once the server or the application stops. So a "message.log" might become "message<date>.log" while there is now a new message.log. The last information in "message<date>.log" is still relevant, especially if the reason for the new log was a crash of the application. I don't know how to deal with this situation, as there are multiple ways logs might change their names, but perhaps others have an idea of how to do that.
It's something I thought about too, although my concerns were with logfiles that get rotated at a specific time of day, not those that change when an application restarts. But like yourself I couldn't find a really good solution.

The file being renamed is not the only problem - it may get compressed,
moved to a different directory, ... lots of ways for it to get lost.

So - if anyone has a bright idea, do speak up.


Regards,
Henrik
list Henrik Størner · Wed, 15 Feb 2006 07:56:20 +0100 ·
quoted from Charles Jones
On Tue, Feb 14, 2006 at 11:43:20PM -0700, Charles Jones wrote:
How will it handle monitoring files that get rotated out?  For example if the hobbit client is monitoring /var/log/messages, and a cron rotate script moves messages to messages.1 and gzips it, will the hobbit client be smart enough to reseek to the end of the newly created file?
Log rotation is difficult to handle - I just wrote about it in another
reply. In the scenario you describe, Hobbit would miss those log
messages that were made between the last client run and the log
rotation - so normally, that would only be log-entries for a few minutes
(since the client runs every 5 minutes).

Hobbit does notice that the log was rotated, and starts sending the
entries that go into the new logfile.
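
The thread does not say how the client notices a rotation. One common approach, sketched here as an assumption rather than a description of what the Hobbit client does, is to remember the byte offset reached on the previous run and start over when the file has become shorter than that offset. The function name new_log_data and the one-number state file are hypothetical.

```shell
# Sketch only: print what was appended to a logfile since the last run,
# restarting from the top when the file appears to have been rotated.

# new_log_data LOGFILE STATEFILE
new_log_data() {
    logfile=$1
    statefile=$2

    # Offset where the previous run stopped (0 on the first run).
    last=0
    [ -f "$statefile" ] && last=$(cat "$statefile")

    # Current size of the logfile in bytes.
    size=$(wc -c < "$logfile" | tr -d ' ')

    # A file that is now shorter than the saved offset was truncated or
    # replaced by a fresh file, so read it from the beginning.
    if [ "$size" -lt "$last" ]; then
        last=0
    fi

    # tail -c +N starts output at byte N (counting from 1).
    tail -c "+$((last + 1))" "$logfile"

    # Remember where we stopped for the next run.
    echo "$size" > "$statefile"
}
```

This misses the case where the new file has already grown past the old offset by the time the client runs. Comparing inode numbers as well would catch that, at the cost of a less portable script.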
quoted from Charles Jones
*** Partially off-topic ***
While looking at another group's monitoring setup, I saw they were using a program called ****** (name doesn't matter), which I found to be inferior to Hobbit, but it did have one nice feature: the ability to test the checksum of a list of files, and send an alert if a file changed (default examples were /etc/passwd, /vmlinuz, /etc/syslog.conf).  I suppose this functionality could be achieved via a client-side external script, but I mention it here because it might be easy to add in now while you are working on the file scanning code :)
I think this is better handled by some of the host-based IDS systems
that are out there - like Tripwire, or the open-source equivalent AIDE.
That's what they are designed to do, and they have much more advanced
techniques of checking that the file contents don't change (multiple
hashes, checking of file meta-data etc.)


Regards,
Henrik
list Thomas Pedersen · Wed, 15 Feb 2006 09:23:22 +0100 ·
Hi Henrik,

So on the central server there will be 2 configuration files. One for
the log retrieval defining interesting items (I guess this is what today
is yellow and red strings) and then a hobbit-client configuration file
where you define the strings again? I am not clear on why you would want
two separate files with some of the same information in them. I get the idea
of a log retrieval configuration file with the log file names (both OS
dependent and local files) but why not have it all in 1 file? Then you
could have a LOGFILE-CLASS definition to put on a host group or single
host in the hobbit-clients file.

Just a thought.

Will this new logfile retrieval also be able to look for logfiles with
variable file names, i.e. logfile.txt-20060215 for today and then a new
filename logfile.txt-20060216 tomorrow? I know it's stupid but that's
how the vendor creates it.

Best regards,
Thomas
list Henrik Størner · Wed, 15 Feb 2006 13:08:04 +0100 ·
quoted from Thomas Pedersen
On Wed, Feb 15, 2006 at 09:23:22AM +0100, Thomas wrote:
Hi Henrik,

So on the central server there will be 2 configuration files. One for
the log retrieval defining interesting items (I guess this is what today
is yellow and red strings) and then a hobbit-client configuration file
where you define the strings again? I am not clear on why you would want
two separate files with some of the same information in them.
No, you wouldn't define the same strings in both files - that would be
silly. You define the strings that can trigger a red or yellow status in
the hobbit-clients config - that's all. 

What you *can* put into the other config are some hints about how to minimize 
the amount of log data that Hobbit needs to process. So you can set up a
regexp of stuff in the logfile that you *never* want to see, and a
regexp of stuff that you *always* want to report - regardless of how
much the log grows. The last one may be identical to some of what you
have in the hobbit-clients config, but it could be different - or you
could go without any definition in the second file at all.

An example: you're monitoring an application that logs some data to a
logfile, and you've set a limit of 200 bytes on the amount of data you
want (that is probably too small for anything, but just for this
example). You know that the application crashes occasionally,
but it usually recovers automatically - so you've just configured
the hobbit-clients.cfg file to send a warning for the application
"Startup complete" message, and an alert for "Startup failed" or
"Error".

The log now looks like this:

  10:41:03 myapp: Startup complete
  10:41:03 myapp: -- MARK --
  10:44:03 myapp: -- MARK --
  10:47:03 myapp: -- MARK --
  10:48:32 myapp: Error reading data, retrying
  10:49:19 myapp: Error reading data, retrying
  10:49:20 myapp: Error reading data, retrying
  10:49:21 myapp: Error reading data, retrying
  10:49:22 myapp: Error reading data, retrying
  10:49:23 myapp: Error reading data, retrying
  10:49:24 myapp: Error reading data, retrying
  10:49:37 myapp: Unhandled exception at myapp_service.c:312: I/O error
  10:49:37 myapp: Instruction dump follows:
  0000000 030460 027060 070155 005147 060504 064556 060543 042012
  0000020 071545 072153 070157 042012 060551 071147 066541 027061
  0000040 064544 005141 067504 072543 062555 072156 005163 041105
  0000060 045512 027123 061145 065552 074545 072163 071157 005145
  0000100 052110 046115 052137 050157 043104 031455 031456 072056
  0000120 071141 063456 005172 060515 066151 046412 064541 062154
  0000140 071151 046412 075157 071141 057564 074563 063155 032137
  0000160 057460 064550 064147 066456 031560 046412 071565 065551
  0000200 047012 073545 005163 064520 071543 005062 051522 067171
  0000220 041143 061541 072553 005160 051523 005114 042530 064160
  0000240 066545 060412 061141 060412 071544 005154 063141 163154
  0000260 027163 074164 005164 067141 064564 064566 072562 005163
  0000300 071141 064143 073151 005145 072541 067564 060563 062566
  0000320 060412 064170 066557 005145 075141 071165 072545 005163
  0000340 033142 005064 033142 027064 005143 060542 065543 070165
  0000360 061012 071541 061551 057563 071550 067167 066056 064544
  0000400 005146 060542 064563 071543 071537 061165 062556 027164
  0000420 062154 063151 061012 026542 067550 072163 005163 061142
  10:49:37 myapp: Initiating recovery restart procedure
  10:49:38 myapp: Startup complete

The "-- MARK --" lines are just noise - they just tell us the
application is running. So you put those into the "ignore" regexp
that is pushed to the client, and the client will filter out those
lines before reporting data to Hobbit.

Hobbit would normally report the last 200 bytes of the logfile.
But in this case, that would only include the dump data and the
"Startup complete" - so you would miss both the fact that the dump 
was due to an unhandled exception, and the fact that it may have 
been triggered by a disk error which causes the application to
retry I/O operations several times. And the "msgs" status would
be yellow.

To catch that, you can tell the Hobbit client to always include
certain log entries in the data it sends - e.g. here you could 
configure it to always include lines containing the word "Error".
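
The example above can be sketched with standard tools. This is a simplified illustration, not the logfetch implementation: it strips the "ignore" lines, sends everything if it fits, and otherwise sends the "always include" lines followed by the tail of the log. The function name select_log_data, its argument order, and the "[...]" marker are all assumptions made for this sketch.

```shell
# Sketch only: decide which part of a logfile to report.

# select_log_data LOGFILE MAXBYTES IGNORE_RE INCLUDE_RE
select_log_data() {
    logfile=$1
    maxbytes=$2
    ignore_re=$3
    include_re=$4

    # Step 1: strip the noise lines ("|| true" because grep exits
    # non-zero when nothing is left).
    filtered=$(grep -v -E -e "$ignore_re" "$logfile" || true)

    # Step 2: if the remainder fits inside the limit, send all of it.
    size=$(printf '%s\n' "$filtered" | wc -c | tr -d ' ')
    if [ "$size" -le "$maxbytes" ]; then
        printf '%s\n' "$filtered"
        return 0
    fi

    # Step 3: too much data. Send the "always include" lines first,
    # then a marker, then the last MAXBYTES bytes of the log.
    printf '%s\n' "$filtered" | grep -E -e "$include_re" || true
    printf '%s\n' "[...]"
    printf '%s\n' "$filtered" | tail -c "$maxbytes"
}
```

For the log shown above, a call such as `select_log_data /var/log/myapp.log 200 'myapp: -- MARK --' 'Error'` (the path is hypothetical) would report the "Error reading data" lines ahead of the final 200 bytes. Note that the tail can begin in the middle of a line, and that a line matching the include regexp may appear twice if it also falls inside the tail.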
quoted from Thomas Pedersen

Will this new logfile retrieval also be able to look for logfiles with
variable file names, i.e. logfile.txt-20060215 for today and then a new
filename logfile.txt-20060216 tomorrow? I know it's stupid but that's
how the vendor creates it.
That's one variant I haven't seen yet. It would be tricky to implement;
couldn't you just run something like this on the client daily via cron:

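   cd /var/log/myapp || exit 1
   # Newest dated logfile, by modification time
   CURRENTLOG=`ls -t logfile.txt-* | head -n 1`
   # -f replaces yesterday's link instead of failing with "File exists"
   ln -sf "$CURRENTLOG" logfile.txt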

and then Hobbit can look at logfile.txt ?


Regards,
Henrik
list Larry Sherman · Wed, 15 Feb 2006 09:34:57 -0500 ·
I have one concern about the amount of logfile data being sent back to
the server.  Our application logfiles are VERY verbose for compliance
reasons, but on a daily basis I don't need 99% of the information.  I
would rather have the ability to have the client side agent use the
regex rule to define what part of the file to send back to the server.

For example:

#Machine name
Gsets001
# LOGFILE define
$LOG_DIR/some_verbose_app_logfile.log
#Rules for logfile define (string match and context)
#COLOR      STRING                          NUMLINESABOVE  NUMLINESBELOW
RED_STRING  /FH DOWN/                       3              4
RED_STRING  /BAD Trade/                     10             1
RED_STRING  /forced log off from server/    0              0
$OTHER_LOG_DIR/another_verbose_app_logfile.log
RED_STRING  /Slow Consumer/                 2              6

Does this make sense to you?
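
The NUMLINESABOVE / NUMLINESBELOW columns in the rules above behave much like the context options of grep. A rough sketch of one rule, assuming a grep that supports -B and -A (GNU and BSD grep do; they are not in POSIX). The function name match_with_context is made up for illustration and is not part of Hobbit.

```shell
# Sketch only: print each line matching PATTERN together with a number
# of lines above and below it.

# match_with_context LOGFILE PATTERN LINES_ABOVE LINES_BELOW
match_with_context() {
    # "|| true" so that "no match" is not treated as an error.
    grep -E -B "$3" -A "$4" -e "$2" "$1" || true
}

# The first rule of the example would then be:
#   match_with_context "$LOG_DIR/some_verbose_app_logfile.log" 'FH DOWN' 3 4
```

With GNU grep, groups of lines that are not adjacent in the file are separated by a line containing "--".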

Larry
list Asif Iqbal · Wed, 15 Feb 2006 11:11:52 -0500 ·
quoted from Henrik Størner
On Wed, Feb 15, 2006 at 07:50:15AM, Henrik Storner wrote:
Hi Rolf,

On Wed, Feb 15, 2006 at 07:30:11AM +0100, Rolf Schrittenlocher wrote:
Hi Henrik,
that sounds wonderful, your ideas are pretty good. 
Thanks.
I have one suggestion and one point to think about.
For monitoring application logs the config file would soon become very complex and likely unreadable. It would be nice to have the possibility of include files, so that general definitions (e.g. for OS-related logs) are made in the general config file, and all the individual stuff goes in <host>.config files.
That would make sense, I agree.
Second, there are logs which change their names once the server or the application stops. So a "message.log" might become "message<date>.log" while there is now a new message.log. The last information in "message<date>.log" is still relevant, especially if the reason for the new log was a crash of the application. I don't know how to deal with this situation, as there are multiple ways logs might change their names, but perhaps others have an idea of how to do that.
It's something I thought about too, although my concerns were with logfiles that get rotated at a specific time of day, not those that change when an application restarts. But like yourself I couldn't find a really good solution.
We have a similar problem where we actually need to monitor a
messages.`date` file. With bb-msgtab you cannot do that. I wonder if
Hobbit would be able to monitor a file with a dynamic extension like that.
quoted from Henrik Størner
The file being renamed is not the only problem - it may get compressed,
moved to a different directory, ... lots of ways for it to get lost.

So - if anyone has a bright idea, do speak up.
Currently we have created symbolic links like these
messages.020106 --> messages
messages.020206 --> messages
messages.022806 --> messages

So when the application writes to the file for a specific date, it
does not matter, since bb-msgtab always reads the `messages' file, whose
name never changes.
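
A sketch of how such links could be created ahead of time, assuming the application follows symbolic links when it opens its dated file. The function name link_dated_names is hypothetical, and the date suffixes are passed in by the caller rather than computed.

```shell
# Sketch only: make TARGET.<suffix> a symbolic link to TARGET for each
# suffix given, so that writes to any dated name end up in TARGET.

# link_dated_names TARGET SUFFIX...
link_dated_names() {
    target=$1
    shift
    for suffix in "$@"; do
        # Link to the bare file name so the link resolves inside the
        # directory that holds it; -f replaces an existing link.
        ln -s -f "$(basename "$target")" "$target.$suffix"
    done
}

# Example (path is illustrative):
#   link_dated_names /var/log/myapp/messages 020106 020206 022806
```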

Thanks
quoted from Asif Iqbal
Regards,
Henrik

-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
"..there are two kinds of people: those who work and those who take the credit...try
 to be in the first group;...less competition there."  - Indira Gandhi
list Rob Munsch · Wed, 15 Feb 2006 17:22:10 -0500 ·
quoted from Henrik Størner
Henrik Stoerner wrote:
the amount of log data that Hobbit needs to process. So you can set up a
regexp of stuff in the logfile that you *never* want to see, and a
regexp of stuff that you *always* want to report - regardless of how
much the log grows. 
Well, that covers my comment.  I'd much rather give a list of "this stuff is always Good" than try to cover every instance of Bad, so that's awesome.

It would be ideal if the central config was somewhat bb-hosts-ish, and could accept (in addition to aforementioned includes) host-specific directives for what to log.  I am assuming that how to react will already be host-specific like every other test, yes?

As far as the logs rotating out... couldn't hobbit look for an environment variable for the format of the rotated logs...?  The filenames vary host to host, but on a given host, a quick look at /var/log tells you what to expect, right? 
Lastly - being someone who couldn't program his way out of a paper stack, I will now cheekily suggest that on install, Hobbit could look at /var/log and guesstimate the format, and ask for human confirmation (as it already does for the hobbit user and homedir).

Even if this automated dream doesn't happen, could it still be set manually or via config?

-- 
Rob Munsch
Solutions For Progress IT
list Greg L Hubbard · Wed, 15 Feb 2006 16:39:39 -0600 ·
My two cents/pence/francs/pesetas/whatever (I am overcharging):

I think it would be cool if the Hobbit client could watch arbitrary log
files for arbitrary messages and turn that into status alarms.  But
don't assume that /var/log or /var/adm is accessible to
ordinary (as in Hobbit client) users -- not around here, anyway.
Besides, I already have another solution for managing my UNIX syslogs --
what I don't have is a way to manage all of my application log files.

Wonder how it would work if the client somehow retrieved "orders" from
the BB server at startup and this was used to drive  a client-side
scanner?  I like the notion of centralized configuration, but...

Otherwise, you might as well forward all the logs to the central server
(syslog-ng?) and have the server parse them.  But this wouldn't work for
the logs that I want to root through with the clients.

GLH 
list Mark Deiss · Thu, 16 Feb 2006 02:43:54 -0600 ·
Suggest that the framework be able to support a client side model in
addition to the server side model being contemplated (i.e. a bulk of the
handling being done on the hobbit collector server). It is fine to talk
about some initial client side processing and passing assorted areas of
interest from system and application log files for the real work to be done
on the collector. The point about administrating multiple clients from a
central collector with rules is well taken. (Aside: why not use swatch as
much as possible?). 

I think client-side processing will be important on AIX and Tru64 servers
which maintain system error logs that are in a binary format. You need the
platform tools to effectively parse the output. Passing this out in date
ranges for later, further processing on a hobbit collector will result in a
maintenance support issue. You need some flexibility in how much to pass on
which I do not see being practical on a collector-side model. The AIX and
Tru64 crowd will still be spending time on the clients making client-side
rule decisions on their regular expressions on how much to relay to the
collector for a particular event. Getting a valid set of information out of
a particular event is not always a simple, predictable matter.  This area
has been addressed with external modules developed on deadcat using
different approaches. It would appear sites using AIX and Tru64 will be
starting from scratch with the Hobbit model and will need to develop some
new tools to fit within the Hobbit collector side model. Gets worse for
Tru64 with the newer versions that require parsing tool sets that even HP
support is not happy with. I am amused per the one HP list recommendation
that you do not parse the ~5.1+ binary error logs - for a failure, you ship
the logs to HP support and let them struggle with it. (sigh....I have
resorted to scanning the newer logs using the older tools such as dia out of
desperation)

At least initially, the easiest approach would be for the Hobbit collector
side approach to follow the stated aim regarding the ASCII text files,
application and OS. The AIX and Tru64 sites may want to run one of the
deadcat variants as a client side only model and tweak it as a totally
separate client test.

For the Linux client base, why not configure logwatch to parse your
application/system logs and pump them over into the hobbit collector for the
additional processing? This product is capable of handling log files with
various date names. I "think".... it may also be able to handle compressed
files.
quoted from Greg L Hubbard


-----Original Message-----
From: Hubbard, Greg L [mailto:user-d970b5e56ec9@xymon.invalid]
Sent: Wednesday, February 15, 2006 5:40 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: RE: [hobbit] Logfile monitoring - I'd like some comments


My two cents/pence/francs/pesetas/whatever (I am overcharging):

I think it would be cool if the Hobbit client could watch arbitrary log
files for arbitrary messages and turn that into status alarms.  But
don't assume that /var/log or /var/adm or /var/adm is accessible to
ordinary (as in Hobbit client) users -- not around here, anyway.
Besides, I already have another solution for managing my UNIX syslogs --
what I don't have is a way to manage all of my application log files.

Wonder how it would work if the client somehow retrieved "orders" from
the BB server at startup and this was used to drive  a client-side
scanner?  I like the notion of centralized configuration, but...

Otherwise, you might as well forward all the logs to the central server
(syslog-ng?) and have the server parse them.  But this wouldn't work for
the logs that I want to root through with the clients.

GLH 

-----Original Message-----
From: Rob Munsch [mailto:user-f39e4aae1456@xymon.invalid] 
Sent: Wednesday, February 15, 2006 4:22 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] Logfile monitoring - I'd like some comments

Henrik Stoerner wrote:
the amount of log data that Hobbit needs to process. So you can setup a
regexp of stuff in the logfile that you *never* want to see, and a 
regexp of stuff that you *always* want to report - regardless of how 
much the log grows.
Well, that covers my comment.  I'd much rather give a list of "this
stuff is always Good" than try to cover every instance of Bad, so that's
awesome.

It would be ideal if the central config was somewhat bb-hosts-ish, and
could accept (in addition to aforementioned includes) host-specific
directives for what to log.  I am assuming that how to react will
already be host-specific like every other test, yes?

As far as the logs rotating out... couldn't hobbit look for an
environment variable for the format of the rotated logs...?  The
filenames vary host to host, but on a given host, a quick look at
/var/log tells you what to expect, right? 

Lastly - being someone who couldn't program his way out of a paper
stack, i will now cheekily suggest that on install, hobbit could look at
/var/log and guesstimate the format, and ask for human confirmation (as
it already does for the hobbit user and homedir).

Even if this automated dream doesn't happen, could it still be set
manually or via config?

--
Rob Munsch
Solutions For Progress IT
list Rolf Schrittenlocher · Thu, 16 Feb 2006 11:27:35 +0100 ·
Hi,

about once a day the hobbit client running on the machine which is hobbit server as well reports: "program crashed" but isn't dying.
clientdata.log shows:
 Worker process died with exit code 134, terminating
hobbitlaunch.log shows:
Task clientdata terminated, status 1

We have a special setup with two instances of hobbitclientlaunch running on the same machine, one reporting as MACHINE=$hostname, the other as MACHINE=<alias of $hostname>. Both clientlaunch-processes keep running without visible problems so I suppose the reports of the two instances trigger some error message without a really existing problem.

-- 
Mit freundlichen Gruessen
Rolf Schrittenlocher

HRZ/BDV, Senckenberganlage 31, 60054 Frankfurt Tel: (XX) XX - XXX XXXXX   Fax: (XX) XX XXX XXXX
LBS: user-1e39a1813094@xymon.invalid
Persoenlich: user-6ea8e907e200@xymon.invalid
list Werner Michels · Thu, 16 Feb 2006 10:16:02 -0200 ·
On Wed, 15 Feb 2006 13:08:04 +0100
user-ce4a2c883f75@xymon.invalid (Henrik Stoerner) wrote:
Will this new logfile retrieval also be able to look for logfiles with 
variable file names, i.e. logfile.txt-20060215 for today and then a new 
filename logfile.txt-20060216 tomorrow? I know it's stupid but that's 
how the vendor creates it.
That's one variant I haven't seen yet. It would be tricky to implement;
couldn't you just run something like this on the client daily via cron:

   cd /var/log/myapp
   CURRENTLOG=`ls -t logfile.txt-* | head -1`
   ln -sf "$CURRENTLOG" logfile.txt

and then Hobbit can look at logfile.txt ?
	Hi Henrik,

	At first, a very big HURRRRRAYY to Henrik and his quickly growing
**hobbit**. And thanks for your time and dedication.

	A suggestion for this tricky situation (and maybe it could help
in two or three other situations) would be to support something like *date
expansion* on the logfile variable. This could be done in two possible
ways.

	1) logfile = "/var/log/some_log.txt-%Y%m%d"

	2) logfile = "/var/log/some_log.txt-$(+%Y%m%d --date '1 day')"

	The first version could be built in, and use the standard time,
strftime ... functions.

	The second could be used by those people who have GNU date on
their client systems, and the hobbit client could expand the pattern using
the date utility. I know, spawning/running external programs is not always
so easy/secure, but it may be an option.

	A variation of the second option (much less secure, and much
more risky) could be to support some kind of shell expansion, and so accept
one-line shell scripts to get the correct name of the logfile. For
example, in the above case:

logfile = "/var/log/$SHELL(ls -t /var/log/app/my_app_log.txt.* |head -1)"

	Anyway, these are only some thoughts... maybe they can be
helpful for something in the future.

	Regards
	Werner
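
The first variant (built-in date expansion) can be sketched in a few lines of Python using the standard strftime fields. This is only an illustration of the idea, not Hobbit code: the helper name `expand_logfile` and its `day_offset` parameter are invented for the example, and a literal "%" in a real filename would have to be written as "%%".

```python
from datetime import datetime, timedelta

def expand_logfile(pattern, now=None, day_offset=0):
    """Expand strftime-style fields in a logfile pattern (hypothetical helper).

    `now` defaults to the current local time; `day_offset` shifts the date,
    e.g. 1 for "tomorrow" or -1 for "yesterday".
    """
    when = (now or datetime.now()) + timedelta(days=day_offset)
    return when.strftime(pattern)

# Example with a fixed date so the result is reproducible
print(expand_logfile("/var/log/some_log.txt-%Y%m%d", datetime(2006, 2, 16)))
```

Because the expansion happens each time the name is needed, the client would pick up the new file after midnight without any cron job or symlink.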
list Rolf Schrittenlocher · Thu, 16 Feb 2006 15:04:24 +0100 ·
Hi,

just got the core file for the following one. gdb says:

(gdb) bt
#0  0xff1a01a0 in _libc_kill () from /usr/lib/libc.so.1
#1  0xff136ce0 in abort () from /usr/lib/libc.so.1
#2  0x0001e81c in setup_signalhandler (
     programname=0xb <Address 0xb out of bounds>) at sig.c:57

we are running hobbit-4.1.2p1 on sun solaris 9. Is this due to our 2 
instances running? But as they have different PIDs that shouldn't be a 
problem I assume. Any suggestions how to debug that?

kind regards
Rolf


my mail from this morning:

about once a day the hobbit client running on the machine which is
hobbit server as well reports: "program crashed" but isn't dying.
clientdata.log shows:
Worker process died with exit code 134, terminating
hobbitlaunch.log shows:
Task clientdata terminated, status 1

We have a special setup with two instances of hobbitclientlaunch running
on the same machine, one reporting as MACHINE=$hostname, the other as
MACHINE=<alias of $hostname>. Both clientlaunch-processes keep running
without visible problems so I suppose the reports of the two instances
trigger some error message without a really existing problem.

-- 
Mit freundlichen Gruessen
Rolf Schrittenlocher

HRZ/BDV, Senckenberganlage 31, 60054 Frankfurt
Tel: (XX) XX - XXX XXXXX   Fax: (XX) XX XXX XXXX
LBS: user-1e39a1813094@xymon.invalid
Persoenlich: user-6ea8e907e200@xymon.invalid
list David Gore · Thu, 16 Feb 2006 22:45:08 +0000 ·
Perhaps this is possible?  What we would like to have is a way to tie a specific log file alert with some text on what to do about the error.

In other words, if you caught something like 'FATAL - something just broke down on proc AE56F', obviously you would see this on the Hobbit web page, but it would be nice if we could include some text on how to fix it or at least identify to the end-user why we care about what might appear to them as a very cryptic log file entry.

Does that make sense?  Optionally add how to fix, or explanatory text for each log file entry that you alarm on?

~David
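
One way to picture this request is a rule table where each pattern carries its own explanatory text. The Python sketch below is purely illustrative: the `RULES` table, the severity value, the note text, and the `annotate` function are assumptions for the example, not an existing Hobbit feature or file format.

```python
import re

# Hypothetical rule table: (pattern, severity, explanatory note for the end-user).
RULES = [
    (re.compile(r"FATAL - something just broke down on proc (\w+)"),
     "red",
     "Example note: this means the named process has stopped; restart it."),
]

def annotate(line):
    """Return (severity, note) for the first matching rule, or None."""
    for pattern, severity, note in RULES:
        if pattern.search(line):
            return severity, note
    return None
```

The web page could then show the note next to the cryptic log entry whenever `annotate` returns a match.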
list Etienne Grignon · Fri, 24 Feb 2006 14:06:28 +0100 ·
Henrik,

Well, maybe we could look at the logcheck project: http://logcheck.org .
I installed it once and the idea was nice.
Every log message was considered an alert until you created a regexp to
ignore it.
So, of course, in the first days we would get a lot of alerts on messages
until the database has all the common regular expressions. This would be
called the "learning time". The nice thing is: if one day a new unknown
message is sent by a client, we are sure to get an alert until we add it to
the regexp database.


So, the knowledge database could of course contain includes, so that we can
have special regexp databases depending on the OS, the group, the host
or the application type, and keep the regexp database clearly organized.
All regexp entries in the database would include the alert type and help
notes to understand the alerts, as you all said.
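
A minimal Python sketch of this "everything is an alert until ignored" approach follows. It only illustrates the principle described above; the sample patterns and the function name `unknown_lines` are made up for the example and are not taken from logcheck or Hobbit.

```python
import re

# Hypothetical "known harmless" patterns, grown during the learning time.
IGNORE = [
    re.compile(r"session opened for user \w+"),
    re.compile(r"CRON\[\d+\]"),
]

def unknown_lines(lines, ignore=IGNORE):
    """Return every line that matches none of the ignore patterns.

    Anything returned here would be raised as an alert, so a message that
    nobody has seen before is reported by default.
    """
    return [line for line in lines
            if not any(pattern.search(line) for pattern in ignore)]
```

Each time a harmless message shows up as an alert, its pattern is added to the list, and the alert volume drops as the database fills.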


To get the configuration from the hobbit server, I think the current
protocol would maybe need an extra word:


The current config message is sent from the client to the hobbit server with
only one argument, the filename:

Config <filename>


I think for the future, it will be easier if you implement the config
message like this:

Config <filename> <hostname>
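
To show how a server might accept both the one-argument and the proposed two-argument form, here is a small Python sketch. It only parses the message text; the function name `parse_config_request` is hypothetical, and matching the keyword case-insensitively is an assumption made for the example.

```python
def parse_config_request(message):
    """Split a config request into (filename, hostname).

    Accepts "config <filename>" and the proposed
    "config <filename> <hostname>"; hostname is None in the first form.
    """
    parts = message.split()
    if len(parts) not in (2, 3) or parts[0].lower() != "config":
        raise ValueError("not a config request: %r" % message)
    filename = parts[1]
    hostname = parts[2] if len(parts) == 3 else None
    return filename, hostname
```

With the hostname present, the server could pick a host-specific version of the file and fall back to the shared one when it is absent.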


 (sorry for the bad English)

--
Etienne