Xymon Mailing List Archive

ext scripts & the Hobbit methodology for implementation.

4 messages in this thread

list S Aiello · Thu, 24 Jan 2008 10:24:24 -0500 ·
Within the last few months I have successfully migrated from BigBrother to Hobbit. The process was long and arduous since we had a large number of customizations in BigBrother. But now that we are on Hobbit, I really want to make use of the power of Hobbit and move away from the BB legacy method for ext scripts.

What I mean by BB legacy ext scripts are client-based ext scripts that send reports in via the bb status command. The Hobbit method is more server-centric, in that clients report raw data to the server and the server processes that data, applying thresholds, formatting reports, and possibly trending. To send raw data from the client there are two choices, data or client. In the past, I believe I saw a thread where Henrik mentioned that data was the preferred method for ext scripts and that client was more for internal use between Hobbit clients and the server. But the lure of the client command is too great to ignore, especially with the ability to access that raw data via the clientlog command.

So that brings me to my first problem: when sending data via the client command, the data needs to be structured in a very particular format, where OSTYPE has to match a predefined Hobbit OSTYPE and a minimum set of SECTIONNAMEs needs to be filled in for the Hobbit server to accept the data. If they are missing, the data just ends up in /dev/null. So I was curious whether, in the upcoming 4.3 (or a newer version), the client command could be opened up for more general usage?

Secondly, after the data is actually accepted by the server, the ability to do something with it is needed. That shouldn't be a problem, since I can attach a script to a Hobbit channel or schedule a command to run clientlog every X minutes and process the data. But how does the script apply thresholds? I am working under the assumption that for my custom data I would be able to add custom definitions to the hobbit-clients.cfg file. But how does my script access those thresholds? Since Hobbit already has all the internals to access this file, I would think that a Hobbit utility would fit the need. Maybe even a new BB command, e.g. bin/bb 1.2.3.4 "threshold server", where there could be an optional parameter "stats=" which could be CPU, PROCS, DISK, etc. Storing the custom ext thresholds in the hobbit-clients.cfg file would also allow the CGI hobbit-confreport.sh to report on the custom thresholds along with the built-in client thresholds.

A sample of a custom client report:
bin/bb 1.2.3.4 "client BlahJVM01.websphere was60
[date]
Thu Jan 24 09:03......
[jvm]
java version: JDK 1.4.2_12
....
[memory]
free: 88565
used: 43209
gc-calls: 39
gc-time: 4
[threadpool]
current: 0
highwater: 10
active: 0
"

Sample hobbit-clients.cfg
  WASTHRD 10 20
  WASMEMFREE 15000 5000
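
A minimal sketch of how a server-side script might apply such thresholds to the raw client data. WASTHRD and WASMEMFREE are the keywords proposed in the sample above, not existing hobbit-clients.cfg settings. The helper names, the choice of which value each keyword checks (threadpool highwater, free memory), and the reading of the two numbers as yellow and red limits are all assumptions made for illustration.

```python
def parse_sections(client_text):
    """Split raw client data into {section: {key: value}}.

    Only "key: value" lines are kept; free-form lines are ignored.
    """
    sections, current = {}, None
    for line in client_text.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value
    return sections


def color_for(value, warn, panic):
    """Return "green", "yellow" or "red" for a value and two limits.

    Assumption: warn < panic means high values are bad (WASTHRD 10 20);
    warn > panic means low values are bad (WASMEMFREE 15000 5000).
    """
    if warn < panic:
        if value >= panic:
            return "red"
        return "yellow" if value >= warn else "green"
    if value <= panic:
        return "red"
    return "yellow" if value <= warn else "green"


# Part of the sample report above, as it might come back from clientlog.
SAMPLE = """[memory]
free: 88565
used: 43209
[threadpool]
current: 0
highwater: 10
active: 0
"""

data = parse_sections(SAMPLE)
thread_color = color_for(int(data["threadpool"]["highwater"]), 10, 20)
memory_color = color_for(int(data["memory"]["free"]), 15000, 5000)
print(thread_color, memory_color)
```

How the script would obtain the two limits from hobbit-clients.cfg is the open question of this thread; here they are simply passed in as numbers.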

Sorry for the long-winded email; I am just trying to move my ext scripts to the Hobbit methodology. It just makes more sense, and offers many more options and power. This was just my take on it. Thoughts?

 ~Steve
list Henrik Størner · Thu, 24 Jan 2008 18:31:46 +0100 ·
quoted from S Aiello
On Thu, Jan 24, 2008 at 10:24:24AM -0500, user-ce96540ed38f@xymon.invalid wrote:
What I mean by BB legacy ext scripts are client-based ext scripts that send reports in via the bb status command. The Hobbit method is more server-centric, in that clients report raw data to the server and the server processes that data, applying thresholds, formatting reports, and possibly trending. To send raw data from the client there are two choices, data or client. In the past, I believe I saw a thread where Henrik mentioned that data was the preferred method for ext scripts and that client was more for internal use between Hobbit clients and the server. But the lure of the client command is too great to ignore, especially with the ability to access that raw data via the clientlog command.
I think this is the real "killer" difference between the client- and the
data-message types. If the client-data wasn't directly available, you
could just as well use a data-message.
So that brings me to my first problem: when sending data via the client command, the data needs to be structured in a very particular format, where OSTYPE has to match a predefined Hobbit OSTYPE and a minimum set of SECTIONNAMEs needs to be filled in for the Hobbit server to accept the data. If they are missing, the data just ends up in /dev/null.
Eh - not quite, you can still get at it with the clientlog command. It
just doesn't generate any cpu/disk/memory... statuses because hobbitd_client - the only program that normally listens in on the Hobbit "client" channel - doesn't know how to interpret the data.

The only requirement for a client-message really is that the hostname
must be listed in bb-hosts. The operating-system type (and the optional
client "class") is used to pick out an entry from the client-local.cfg
file which is returned to the client system, but there is no requirement
that these must be pre-defined anywhere in Hobbit.
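
A minimal sketch of the round trip described here: a client message with an arbitrary operating-system type and class, and the matching clientlog query. Both helper names are hypothetical. The header layout follows the sample in the first message, and the "section=" filter is assumed to work as documented for the clientlog command. Hostnames containing dots are not handled.

```python
def build_client_message(host, ostype, hostclass, sections):
    """Return the text of a "client" message for host.

    sections maps a section name to a list of already formatted lines.
    ostype and hostclass are free-form; only host must be in bb-hosts.
    """
    lines = [f"client {host}.{ostype} {hostclass}"]
    for name, body in sections.items():
        lines.append(f"[{name}]")
        lines.extend(body)
    return "\n".join(lines) + "\n"


def clientlog_request(host, sections=()):
    """Return a clientlog query, optionally limited to some sections."""
    request = f"clientlog {host}"
    if sections:
        request += " section=" + ",".join(sections)
    return request


message = build_client_message(
    "BlahJVM01", "websphere", "was60",
    {
        "memory": ["free: 88565", "used: 43209"],
        "threadpool": ["current: 0", "highwater: 10", "active: 0"],
    },
)
query = clientlog_request("BlahJVM01", ["memory", "threadpool"])
print(message, end="")
print(query)
```

Each string would be passed as the message argument to bin/bb, in the same way as the sample report in the first message.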

So I was curious whether, in the upcoming 4.3 (or a newer version), the client command could be opened up for more general usage?
I think what you really want to be able to do is to combine client-side
data from multiple sources - e.g. you have the Hobbit client running and
feeding the normal client data into Hobbit, and then you have a number
of custom add-ons that generate data, and you want this data to be
available in Hobbit as if it had been part of the "normal" client data.
Like
    Hobbit client sends data (uptime, vmstat, ps-listing etc.)
    Add-on script A sends additional data (eg. JVM performance metrics)
    Add-on script B sends additional data (eg. custom application data)
    User views the "clientlog" info on Hobbit and sees all three

It isn't difficult to implement, most of the pieces are already there.
Some things that will need to be done in the Hobbit daemon:
- store client-ext-FOO data separately from client-ext-BAR (to know what
  to overwrite when an update arrives).
- should there be a new protocol command for extensions? I think that
  might be necessary.
- Should there be a "client-local.cfg"-like file for extensions? It
  could be useful.
- for performance reasons, it would probably make sense to have some way
  of splitting extension data off from the normal client data, and then
  run a "client" channel worker handling each of the extension scripts.
  That way we don't need to send all of the client data through all of
  the add-on handlers, which will just throw it away.
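
The first item in the list above, storing each extension's data separately so that an update only overwrites its own part, can be sketched roughly as follows. The class and the source names ("client", "ext-jvm") are hypothetical; nothing like this is claimed to exist in the Hobbit daemon.

```python
class ClientDataStore:
    """Keep the latest data per host and per source (client or add-on)."""

    def __init__(self):
        self._data = {}  # host -> {source name: raw text}

    def update(self, host, source, text):
        # An update replaces only the data previously sent by that source.
        self._data.setdefault(host, {})[source] = text

    def clientlog(self, host):
        # Combined view: all sources, in the order they first reported.
        return "".join(self._data.get(host, {}).values())


store = ClientDataStore()
store.update("serverA", "client", "[uptime]\nup 10 days\n")
store.update("serverA", "ext-jvm", "[jvm]\ngc-calls: 39\n")
store.update("serverA", "ext-jvm", "[jvm]\ngc-calls: 40\n")
print(store.clientlog("serverA"), end="")
```

The second JVM update replaces the first without touching the data from the normal client, which is the property the list item asks for.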

Secondly, after the data is actually accepted by the server, the ability to do something with it is needed. That shouldn't be a problem, since I can attach a script to a Hobbit channel or schedule a command to run clientlog every X minutes and process the data. But how does the script apply thresholds? I am working under the assumption that for my custom data I would be able to add custom definitions to the hobbit-clients.cfg file. But how does my script access those thresholds? Since Hobbit already has all the internals to access this file, I would think that a Hobbit utility would fit the need. Maybe even a new BB command, e.g. bin/bb 1.2.3.4 "threshold server", where there could be an optional parameter "stats=" which could be CPU, PROCS, DISK, etc. Storing the custom ext thresholds in the hobbit-clients.cfg file would also allow the CGI hobbit-confreport.sh to report on the custom thresholds along with the built-in client thresholds.
It's hard for me to see how difficult this would be to implement. I know
the client configuration handling is a bit tricky in how it's
implemented, I don't know what it takes to provide a generic interface
to this for custom client-modules.

A "BB" command to fetch the configuration? No, don't like that idea. The
hobbitd daemon should not have to bother with configuration files it
doesn't need for itself.

Another complication is that the program using the configuration info is
"long-lived", so it will have to reload the configuration when it
changes. And that has some performance implications - the first attempt
at the current client-configuration performed very poorly because the
way it scanned all of the client configuration rules was extremely slow,
simply because of all the pattern-matching of hostnames, testnames,
colors, time-specifications etc. hobbitd_client now caches the list of relevant configuration rules for each host once it has determined what they are, and this gave quite a performance boost. Having 5-10 add-on handlers making the same mistake would kill your server.
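
The caching idea described above can be sketched as follows: the expensive pattern scan runs once per host, and the cache is dropped whenever the configuration is reloaded. This is only an illustration of the technique, not hobbitd_client's actual code. The class, the rule representation, and the setting strings (taken from the sample in the first message) are assumptions.

```python
import re


class RuleCache:
    """Cache the configuration rules that apply to each host."""

    def __init__(self, rules):
        self._rules = rules   # list of (compiled host pattern, setting)
        self._by_host = {}    # host -> list of matching settings
        self.scans = 0        # how many full rule scans were needed

    def reload(self, rules):
        # A long-lived handler must forget cached results on reload.
        self._rules = rules
        self._by_host.clear()

    def rules_for(self, host):
        if host not in self._by_host:
            self.scans += 1
            self._by_host[host] = [
                setting for pattern, setting in self._rules
                if pattern.search(host)
            ]
        return self._by_host[host]


rules = [
    (re.compile(r"^web\d+$"), "WASTHRD 10 20"),
    (re.compile(r"JVM"), "WASMEMFREE 15000 5000"),
]
cache = RuleCache(rules)
first = cache.rules_for("web1")
second = cache.rules_for("web1")   # served from the cache, no new scan
print(first, cache.scans)
```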

I'll have to think about this.

Sorry for the long-winded email; I am just trying to move my ext scripts to the Hobbit methodology. It just makes more sense, and offers many more options and power. This was just my take on it. Thoughts?
It's a good idea, and if it can be done right then it would fit in
really well with the Hobbit philosophy of providing good basic
monitoring, but also enabling you to add whatever extras you need.
So I like it, but it does require some thought to get right.


Regards,
Henrik
list S Aiello · Thu, 24 Jan 2008 18:34:47 -0500 ·
quoted from Henrik Størner
On Thursday 24 January 2008, Henrik Stoerner wrote:
On Thu, Jan 24, 2008 at 10:24:24AM -0500, user-ce96540ed38f@xymon.invalid wrote:
What I mean by BB legacy ext scripts are client-based ext scripts that
send reports in via the bb status command. The Hobbit method is more
server-centric, in that clients report raw data to the server and the
server processes that data, applying thresholds, formatting reports, and
possibly trending. To send raw data from the client there are two
choices, data or client. In the past, I believe I saw a thread where
Henrik mentioned that data was the preferred method for ext scripts and
that client was more for internal use between Hobbit clients and the
server. But the lure of the client command is too great to ignore,
especially with the ability to access that raw data via the clientlog
command.
I think this is the real "killer" difference between the client- and the
data-message types. If the client-data wasn't directly available, you
could just as well use a data-message.
So that brings me to my first problem: when sending data via the client
command, the data needs to be structured in a very particular format,
where OSTYPE has to match a predefined Hobbit OSTYPE and a minimum set of
SECTIONNAMEs needs to be filled in for the Hobbit server to accept the
data. If they are missing, the data just ends up in /dev/null.
Eh - not quite, you can still get at it with the clientlog command. It
just doesn't generate any cpu/disk/memory... statuses because
hobbitd_client - the only program that normally listens in on the Hobbit
"client" channel - doesn't know how to interpret the data.

The only requirement for a client-message really is that the hostname
must be listed in bb-hosts. The operating-system type (and the optional
client "class") is used to pick out an entry from the client-local.cfg
file which is returned to the client system, but there is no requirement
that these must be pre-defined anywhere in Hobbit.
Ah, you are right. My tests earlier this morning were failing for other reasons.
quoted from Henrik Størner
So I was curious whether, in the upcoming 4.3 (or a newer version), the
client command could be opened up for more general usage?
I think what you really want to be able to do is to combine client-side
data from multiple sources - e.g. you have the Hobbit client running and
feeding the normal client data into Hobbit, and then you have a number
of custom add-ons that generate data, and you want this data to be
available in Hobbit as if it had been part of the "normal" client data.
Like
    Hobbit client sends data (uptime, vmstat, ps-listing etc.)
    Add-on script A sends additional data (eg. JVM performance metrics)
    Add-on script B sends additional data (eg. custom application data)
    User views the "clientlog" info on Hobbit and sees all three
Well, that would be nice too, but where I am actually headed is multiple devices reporting from a single server. Let's say serverA runs 5 Apache instances (not 5 vhosts). I would like to have 6 devices in total in Hobbit: serverA, web1, web2, web3, web4, web5. So I am not sure that having all the clientlog data contained under serverA would be such a good idea. That clientlog could get rather large (I have some web servers that run 20+ Apaches). So I was thinking more that each device in Hobbit would have its own clientlog data. Whether clientlog data should be separate or combined, I am not sure; you are better placed to decide that.

I was also going to do this for my app servers. We run multiple JVMs or apps on a given server. So base server stats are good, but it would be great to have other device entries in Hobbit that are completely focused on each JVM (just reports focused on that one JVM instance). Presently I have reports that I term monolithic: they report on every JVM or Apache in one report. Some people have issues with such large, multi-app reports. When there is an alert, it is unclear what the cause is, or when a new issue occurs it is lost behind a previous alert state from a different JVM. By splitting the monolithic reports into separate reports for each instance, the page would become more of a dashboard: JVM-A has a red on CPU, or the DBPool for JVM-C is yellow. The user can then drill down into the report to find the details. Another benefit is that I could then create application-specific pages. On these pages the application's entire tech stack (web, app, DB, etc.) could be displayed without any data from a different app. The application teams would then have a clean page and see the health of their app at a glance.

And then if I needed semi-monolithic reports, i.e. a rollup of the application's load-balanced web servers, I can just use bbcombotest to combine the 4 web server device reports.
quoted from Henrik Størner
It isn't difficult to implement, most of the pieces are already there.
Some things that will need to be done in the Hobbit daemon:
- store client-ext-FOO data separately from client-ext-BAR (to know what
  to overwrite when an update arrives).
- should there be a new protocol command for extensions? I think that
  might be necessary.
- Should there be a "client-local.cfg"-like file for extensions? It
  could be useful.
- for performance reasons, it would probably make sense to have some way
  of splitting extension data off from the normal client data, and then
  run a "client" channel worker handling each of the extension scripts.
  That way we don't need to send all of the client data through all of
  the add-on handlers, which will just throw it away.
Secondly, after the data is actually accepted by the server, the
ability to do something with it is needed. That shouldn't be a problem,
since I can attach a script to a Hobbit channel or schedule a command to
run clientlog every X minutes and process the data. But how does the
script apply thresholds? I am working under the assumption that for my
custom data I would be able to add custom definitions to the
hobbit-clients.cfg file. But how does my script access those thresholds?
Since Hobbit already has all the internals to access this file, I would
think that a Hobbit utility would fit the need. Maybe even a new BB
command, e.g. bin/bb 1.2.3.4 "threshold server", where there could be an
optional parameter "stats=" which could be CPU, PROCS, DISK, etc.
Storing the custom ext thresholds in the hobbit-clients.cfg file would
also allow the CGI hobbit-confreport.sh to report on the custom
thresholds along with the built-in client thresholds.
It's hard for me to see how difficult this would be to implement. I know
the client configuration handling is a bit tricky in how it's
implemented, I don't know what it takes to provide a generic interface
to this for custom client-modules.

A "BB" command to fetch the configuration? No, don't like that idea. The
hobbitd daemon should not have to bother with configuration files it
doesn't need for itself.

Another complication is that the program using the configuration info is
"long-lived", so it will have to reload the configuration when it
changes. And that has some performance implications - the first attempt
at the current client-configuration performed very poorly because the
way it scanned all of the client configuration rules was extremely slow,
simply because of all the pattern-matching of hostnames, testnames,
colors, time-specifications etc. hobbitd_client now caches the list of
relevant configuration rules for each host once it has determined what
they are, and this gave quite a performance boost. Having 5-10 add-on
handlers making the same mistake would kill your server.

I'll have to think about this.
Yeah, I can only imagine the headache of trying to implement powerful features while keeping good performance. I have barely poked around with my fledgling C experience. So I hope I do not come off as asking for features X, Y, Z without appreciating the complexity, or without jumping into the code and trying to solve or implement it myself. I was just thinking through what features would make sense and how they might be implemented in the 'Hobbit way'.
quoted from Henrik Størner
Sorry for the long-winded email; I am just trying to move my ext scripts
to the Hobbit methodology. It just makes more sense, and offers many
more options and power. This was just my take on it. Thoughts?
It's a good idea, and if it can be done right then it would fit in
really well with the Hobbit philosophy of providing good basic
monitoring, but also enabling you to add whatever extras you need.
So I like it, but it does require some thought to get right.
I don't think I explained why I wanted to access the clientlog data, so I will try to explain. I had hoped to send more information to Hobbit than just alerting stats, more like configuration and runtime information. Since I would then be monitoring most of the application stack (web, app, db), I could wrap a script around all that data to query and provide up-to-the-minute information on applications. For example, which IPs are presently being used and for what, or which HTTP URLs are which team's responsibility to support (determined by which server the Apache instance is running on, etc). This would make support of my environment almost a breeze. Especially since I had planned to write the ext scripts in such a way that when a tech added web6 to serverA, the data would be collected automatically and the device entry for web6 would be added to Hobbit automatically (removals would be manual). So this extra data doesn't need to be seen in alerting reports, but it would be rather nice to query for it and know that the Hobbit clients/scripts were gathering and feeding Hobbit with the latest info.
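
A minimal sketch of the "web6 appears automatically" step: compare the instances an ext script discovered with the hosts already listed in bb-hosts. The helper names are hypothetical. The parser assumes plain "IP hostname # tags" lines and deliberately ignores everything else (page and group lines, include directives). Actually adding the missing entries is left out.

```python
import ipaddress


def hosts_in_bbhosts(text):
    """Return the set of hostnames found on "IP hostname" lines."""
    hosts = set()
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not a host line, e.g. a page or group definition
        hosts.add(fields[1])
    return hosts


def missing_devices(discovered, bbhosts_text):
    """Return discovered instance names that bb-hosts does not list yet."""
    return sorted(set(discovered) - hosts_in_bbhosts(bbhosts_text))


BBHOSTS = """page web Web servers
1.2.3.4 serverA # ssh
0.0.0.0 web1 # noconn
0.0.0.0 web2 # noconn
"""

new_devices = missing_devices(["web1", "web2", "web6"], BBHOSTS)
print(new_devices)
```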

Again, sorry for the very verbose email.
 ~Steve
list Scott Walters · Sat, 2 Feb 2008 23:40:48 -0500 ·
Steve,
  Could you perhaps create a hobbit hostname for each application you'd like
to monitor?   So in the hobbit server, you'd see APP1 and APP2 with
APP1.apache, APP1.http, and APP2.db monitors.  Then on the client, have the
various data-collecting scripts change the hobbit hostname.
This would also have the added benefit that if you move applications around,
the monitors/collectors move with them.

Aggregating load balanced application data metrics is another nightmare, so
I could only propose APP1-instance1.metric.
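
A minimal sketch of this naming scheme, where one collector on a physical server reports under a separate hobbit hostname per application instance. The helper name and the instance data are made up for illustration. The message layout is the usual "status HOSTNAME.TESTNAME COLOR text" form, with dots in the hostname written as commas.

```python
def status_message(host, test, color, text):
    """Return a status message reported under the given hobbit hostname."""
    wire_host = host.replace(".", ",")  # dots separate host from test name
    return f"status {wire_host}.{test} {color} {text}"


# Hypothetical results gathered on one server for two instances of APP1.
results = {
    "APP1-instance1": ("green", "apache responding"),
    "APP1-instance2": ("red", "apache not running"),
}

messages = [
    status_message(host, "apache", color, text)
    for host, (color, text) in results.items()
]
for message in messages:
    print(message)
```

Each hostname used this way still has to be listed in bb-hosts, as noted earlier in the thread.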

-Scott