Xymon Mailing List Archive

conn alerts based on ping time

12 messages in this thread

list Charles Jones · Fri, 13 Jan 2006 11:01:11 -0700 ·
I'm helping someone set up Hobbit at their company, and they want to monitor the status of a remote office T1 link.  Of course Hobbit can tell them if the link goes totally down, or you can ignore bad pings with "badconn",  but they want to know when the link is *slow*, as they often have periods of time when the pings are not dropped, but instead take 1-3 seconds (instead of <100ms like normal).
Is there any chance that Hobbit will soon support comparing the ping replies to specified values for green, yellow, and red?

Something like:

1.2.3.4 myhost.com # conn:200:500

This would make myhost.com's conn test go yellow if the ping was between 200 and 500ms, and red if it was over 500ms.
Since hobbit already graphs the numeric values of the ping replies, this seems like it would be fairly easy to add?
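
To make the proposal concrete, here is a minimal Python sketch of the requested behaviour. It is not Hobbit code: the `conn:WARN:CRIT` tag is only the syntax suggested above, and the function names `parse_conn_tag` and `conn_color` are hypothetical.

```python
from typing import Optional, Tuple

def parse_conn_tag(tag: str) -> Optional[Tuple[float, float]]:
    """Parse the proposed 'conn:WARN:CRIT' tag into (warn_ms, crit_ms).

    Returns None for a plain 'conn' tag or any other tag, meaning
    "no latency thresholds configured".
    """
    parts = tag.split(":")
    if len(parts) != 3 or parts[0] != "conn":
        return None
    warn_ms, crit_ms = float(parts[1]), float(parts[2])
    if warn_ms > crit_ms:
        raise ValueError("warning threshold must not exceed critical threshold")
    return warn_ms, crit_ms

def conn_color(ping_ms: Optional[float],
               thresholds: Optional[Tuple[float, float]]) -> str:
    """Map one ping result to a status color.

    ping_ms is None when the ping got no reply at all.
    """
    if ping_ms is None:
        return "red"            # link is down: same as today's behaviour
    if thresholds is None:
        return "green"          # no thresholds: any reply counts as OK
    warn_ms, crit_ms = thresholds
    if ping_ms > crit_ms:
        return "red"            # over the upper limit
    if ping_ms >= warn_ms:
        return "yellow"         # between the two limits
    return "green"

thresholds = parse_conn_tag("conn:200:500")
for sample in (80, 350, 1500, None):
    print(sample, conn_color(sample, thresholds))
```

With the tag above, a normal sub-100ms reply stays green, a 350ms reply goes yellow, and the 1-3 second replies described in this message go red.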

-Charles
list Richard Deal · Fri, 13 Jan 2006 13:14:03 -0500 ·
Sounds like they need to throw in MRTG and go red when the traffic is
high on the link.

And then throw in things like

bb-ospf.pl to check that OSPF is not flapping over the link

bb-xsnmp.pl to check out the routers at each end and the interfaces

You can also use http to a reliable server on the remote side as part of
the link test.  Just make the http test for the link dependent on the
router and the conn test to the web server.
list Charles Jones · Fri, 13 Jan 2006 11:26:28 -0700 ·
Deal, Richard wrote:
> Sounds like they need to throw in MRTG and go red when the traffic is high on the link.
>
> And then throw in things like
>
> bb-ospf.pl to check that OSPF is not flapping over the link
>
> bb-xsnmp.pl to check out the routers at each end and the interfaces

Yeah, I'm aware of the existence of bb-mrtg.pl, although I have never set it up.  I guess I was hoping that Hobbit could natively support ping testing rather than having to install MRTG and hack stuff in.  It's sort of confusing for a newbie when you are showing them the ropes of Hobbit and start bringing external scripts into the mix (especially ones that require modifying before they will work).

> You can also use http to a reliable server on the remote side as part of the link test.  Just make the http test for the link dependent on the router and the conn test to the web server.

That won't work in this case, as all of the company's servers are in a CoLo, Hobbit is running at the CoLo, and they want to test the T1 link at the office from the CoLo (there are no servers on the other side of the office T1 to do a test against). Even if there were, it still would not give them a heads-up that the T1 is slow or saturated, as Hobbit only alerts when the conn test outright fails.

-Charles
list Jeff Newman · Fri, 13 Jan 2006 15:12:06 -0600 ·
Really, honestly, I'm not trying to belabor a point here, but you need to be
careful, as the ping only runs every 5 minutes, so even if you could get this
alerting to work, the link would have to be slow during a ping cycle. So it
could possibly be slow for 4 minutes, recover, and the page wouldn't happen,
as the ping time would be OK. Assuming the client saw the slowness during
those 4 minutes via other methods, they would then question why hobbit
didn't see it.

The same thing happens to me with spikes in network traffic between polling
periods: I don't see them.

With MRTG, you can shorten the time to 1 minute. MRTG integration with
hobbit isn't too hard, so that's probably the route you should go.
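
The sampling problem can be shown with a few lines of Python. This is only an illustration of the arithmetic, not anything Hobbit or MRTG runs; the function name `polls_in_window` and the example timings (a 4-minute slow period starting 30 seconds after a poll) are assumptions chosen for the example.

```python
def polls_in_window(slow_start, slow_end, interval, first_poll=0):
    """Return the poll times that land inside the slow period.

    All values are in seconds. Polls happen at first_poll,
    first_poll + interval, first_poll + 2*interval, and so on.
    The slow period is the half-open range [slow_start, slow_end).
    """
    if slow_start <= first_poll:
        t = first_poll
    else:
        # Ceiling division: number of intervals until the first poll
        # at or after slow_start.
        steps = -(-(slow_start - first_poll) // interval)
        t = first_poll + steps * interval
    hits = []
    while t < slow_end:
        hits.append(t)
        t += interval
    return hits

# Link is slow from t=30s to t=270s (4 minutes).
print(polls_in_window(30, 270, interval=300))  # 5-minute polling
print(polls_in_window(30, 270, interval=60))   # 1-minute polling
```

With a 5-minute interval no poll falls inside the slow period, so nothing would be alerted; with a 1-minute interval four polls do.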

-Jeff
list Charles Jones · Fri, 13 Jan 2006 14:59:06 -0700 ·
Jeff,

That's a very good point. Do you know if anyone has documented setting up MRTG with Hobbit?  I searched the mailing list archives and didn't find anything concise. 
I will probably end up recommending and implementing the MRTG solution, but I still think it should be trivial for hobbit to alert on the ping response, since it already collects that data. I guess a real solution would be for hobbit to come with an MRTG module "out of the box" so that users didn't have to delve into the knowledgebase and/or depend on places like deadcat to find and provide the functionality they need.

I myself don't mind using external scripts and having to tinker with something to get it to work the way I want, but it's hard enough to sell management on Hobbit over commercial and well-known tools like Nagios, without having to reveal that you need to spend a day downloading external scripts and making them work in order to get the functionality that they expect (and that they think that the commercial tools already have).

I believe that a well-setup hobbit monitor is superior to Nagios and other tools I have tested and been forced to use over the years. But the fact that a lot of the application-specific monitoring (mysql, oracle, postgres, etc), as well as traffic monitoring (MRTG) is handled by third-party scripts that you have to meld into your server probably scares away a lot of people, especially management types who have security folks whispering in their ear to never trust third-party modules and especially not code written by "joe-user from some website" (a manager actually said that to me once).  As of yet Hobbit does not even have a fully functional client (no logfile parsing), so we have to use either the bb-client or the bb-msgs script....more third party plugins.

I'm not sure where I'm going with this, I guess what I'm saying is I would like to see Hobbit come with built-in support for monitoring common applications and services (besides the basics). It's already partway there as Hobbit can natively check things like mysql, but what about postgres, oracle?

Henrik is a busy guy I am sure, and he probably doesn't get much compensation for all the fine work he does on Hobbit, nor does he ask for any (I did buy him one of his wishlist items, I hope others do as well). As far as I know, Henrik has nobody helping him, except for seeing him mention someone was working on a new Hobbit client. Maybe what we need is more people to roll up their sleeves and write some modules that are compatible with hobbit with little or no tweaking. Sadly I'm no C/C++ guru, but I am pretty good with Perl :-) 
I think also perhaps we need an "official" repository of scripts that work with Hobbit, so when someone needs an addon, they can grab an already Hobbit-ized one, instead of going to deadcat and getting a script to hack on. Also a Wiki might be handy, so that Hobbit users can easily share and update information on various Hobbit setups and problems.

Okay, I have written WAY more than I intended here, I'm so far off topic now that I will edit the subject line as a warning :)

-Charles
list Jeff Newman · Tue, 17 Jan 2006 14:12:37 -0600 ·
Charles,

hobbit is already pretty much integrated with MRTG. Sure, you have to
install MRTG and configure the mrtg.cfg, but that's pretty simple. Other
than that, hobbit is already coded to look for MRTG RRD files. See the
tips&tricks section under help. There is a document that describes
setting up MRTG with hobbit. I also looked at bb-mrtg and found it daunting
to try and figure out how it all works together with hobbit, so I stuck
with the standard integration. I also didn't like the fact that plain MRTG
with bb-mrtg seemed to not use RRD and stuck with the default way MRTG
works (a la lots of .png files), but I could be wrong about that since I
have never done it.

Anyway, in short, putting MRTG (a well-known, easy-to-install program) on
the hobbit server and following the instructions in the help pages took me
less than 10-15 minutes to get up and running (I had already downloaded the
Linux MRTG rpm and installed it).

-Jeff

list Henrik Størner · Wed, 18 Jan 2006 22:50:54 +0100 ·
I started replying to Charles' mail a couple of days ago, then
waited for a few days before coming back to it. This does cover
a lot of different areas so I'm rambling a bit, but bear with me
for continuing off the tangent that Charles set out on.

On Fri, Jan 13, 2006 at 02:59:06PM -0700, Charles Jones wrote:
> I believe that a well-setup hobbit monitor is superior to Nagios and
> other tools I have tested and been forced to use over the years. But the
> fact that a lot of the application-specific monitoring (mysql, oracle,
> postgres, etc), as well as traffic monitoring (MRTG) is handled by
> third-party scripts that you have to meld into your server probably
> scares away a lot of people, especially management types who have
> security folks whispering in their ear to never trust third-party
> modules and especially not code written by "joe-user from some website"
> (a manager actually said that to me once).  As of yet Hobbit does not
> even have a fully functional client (no logfile parsing), so we have to
> use either the bb-client or the bb-msgs script....more third party plugins.

True. It's one of the things that need to be fixed soon.

I understand your management concerns - I've heard the same stuff 
over the years. I am fortunate to have a superior who saw the
potential for BB and then Hobbit, because Unicenter/TNG couldn't do
what we needed - and stuck with that decision for three years until
everyone could see that Hobbit is a very good solution to our 
monitoring needs.

One thing that I've learned over the years is that to get your
management interested in Hobbit, you must show them something that they
can see is useful. Geeks like myself often get lost in the wonders of
technology - "look, it can match the login JSP against this regular
expression so we can see if all of the EJB resources are OK!" ...
forget it when talking to bosses. What they want is graphs, reports,
and the knowledge that whenever something happens that might affect
their bonus, one of the techies will be alerted and take action.

Big Brother was a "geek thing" when I started using it. OK, it was
better than Unicenter because you could check if the websites we
hosted were OK without any special monitoring console installed.
But it didn't *really* catch on until I added LARRD to get the
response-time graphs, and started generating reports that showed
the monthly availability. That's when management found out that
"hey, this looks good" and gave the OK for me to work on it. I still
nurse them a bit on occasion - just last week, they complained that
a particular kind of reporting to the customer was difficult, so I
came up with a tool that pulled all of the data from Hobbit and 
presented it so that it could be cut-and-pasted into a Word document.
A simple one-hour job, but it gives immense credit.

So - listen to your PHB's and try to figure out what it is that they
*really* want for a killer feature. If it turns out Hobbit cannot
do that out of the box, speak up - it's happened more than once that
an essential improvement required only a few lines of code in the
right spot. Showing people how quickly Open Source tools can adapt
to *your* needs is pretty powerful.

> I'm not sure where I'm going with this, I guess what I'm saying is I
> would like to see Hobbit come with built-in support for monitoring
> common applications and services (besides the basics). It's already
> partway there as Hobbit can natively check things like mysql, but what
> about postgres, oracle?

Would be nice, yes. It can do the Oracle TNS listener, but detailed
Oracle checks are missing. Or PostgreSQL, for that matter.

I'm a bit cautious about making the "core" Hobbit tools know everything
about anything out there. It would be a maintenance nightmare since
there would be lots of stuff that I have no way of testing myself.
The add-on mechanism is a bit more work for the admin, but I think
people need monitoring for lots of very diverse stuff, and trying to
cover all of it would just result in lots of not-very-good solutions.
So what I prefer to do is to make Hobbit flexible enough that writing
add-ons is easy, and therefore the people who *know* about what's
interesting to look at in a DB2 database installation (or whatever it
is they want to monitor) can put in the missing pieces and come up
with a good monitoring solution.

It's also a lot easier to weed out the bugs in an add-on module,
so if it turns out to be generally useful, I have something
that has been debugged which I can merge into Hobbit as part of
the core toolset.


This brings us to the issue of a repository for Hobbit add-ons:

One thing I really could use some help with is setting up a web
repository like deadcat for Hobbit things. Sourceforge could host 
it, I suppose, but it would need someone to set it up and manage
submissions - I don't think Sourceforge has anything automated
like deadcat. 

> Henrik is a busy guy I am sure, and he probably doesn't get much
> compensation for all the fine work he does on Hobbit, nor does he ask
> for any (I did buy him one of his wishlist items, I hope others do as
> well). As far as I know, Henrik has nobody helping him, except for
> seeing him mention someone was working on a new Hobbit client. Maybe
> what we need is more people to roll up their sleeves and write some
> modules that are compatible with hobbit with little or no tweaking.

Thanks :-) Yes, I am fairly busy - got a job to attend to on occasion -
so you are right that I could use some help with add-ons for Hobbit.

A colleague of mine *is* working on the Win32 client, so that is well 
underway. He's got some data flowing now, but there are still some
pieces missing.

> Sadly I'm no C/C++ guru, but I am pretty good with Perl :-)

I'm just the opposite :-) But that shouldn't hold you back - as long
as you "only" write add-on modules there's a great deal of freedom
in choosing your tools. Even Hobbit server modules - those that get
their input directly from the hobbitd daemon - can be written in any
language, since the interface is just reading standard-input. Add-on
tests obviously just use the "bb" commandline tool to send their
results.
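
As a rough illustration of how small such an add-on test can be, here is a Python sketch that builds a status message and hands it to the "bb" tool. Several details are assumptions rather than anything stated in this thread: the BB-style message layout ("status host.test color timestamp text", with dots in the hostname written as commas), the invocation `bb <display-host> <message>`, and the hypothetical test name "pingtime". Check them against your own installation before relying on them.

```python
import subprocess
import time

def build_status_message(host, test, color, summary, timestamp=None):
    """Build a BB-style status message (layout assumed, see above)."""
    machine = host.replace(".", ",")   # assumed BB hostname convention
    if timestamp is None:
        timestamp = time.strftime("%a %b %d %H:%M:%S %Y")
    return f"status {machine}.{test} {color} {timestamp} {summary}"

def send_status(bb_command, display_host, message):
    """Run the bb tool; the path and display host come from the caller."""
    subprocess.run([bb_command, display_host, message], check=True)

msg = build_status_message(
    "myhost.com", "pingtime", "yellow",
    "ping time 350 ms exceeds 200 ms",
    timestamp="Fri Jan 13 11:01:11 2006",
)
print(msg)
# send_status("/path/to/bb", "hobbit-server", msg)  # paths are placeholders
```

The same approach works in Perl or shell; the add-on only has to decide on a color and produce one line of text.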


I do appreciate anyone helping with improving Hobbit. But I am also
concerned about becoming a bottleneck for getting things published.
That's why I would like to have this repository set up so there's an
easy way of publishing add-ons, without having to wait for me. If
some of you want to gang up and do something together, I can
quickly set up a dedicated mailing list for you - if that makes it
easier.

(There are over 300 addresses on the hobbit mailing list now ... a
year ago, I was proud when it passed subscription #100. And almost
1500 downloads of the latest version from sourceforge - it's getting
big).


The past 18 months have been pretty intense - Hobbit has developed
very quickly. I think that will continue for another year or so; there
is some stuff I see as needing work right now:
- the client package needs logfile monitoring badly.
- the alert/acknowledge mechanism needs improving, so it can handle
  things like escalating alerts and different groups acknowledging
  an alert (this is in fact already being worked on).
- the graph displays (which graphs go on which pages) need an
  overhaul. The current system is a bit of a mess, and not flexible
  enough.
- I want to be able to trigger status-changes (and hence alerts)
  from the data that only goes into the graphs, currently. E.g. instead
  of the CPU alert triggering if the load average goes above 5 (which
  is pretty meaningless nowadays), I'd like it to trigger a warning if
  the %system time exceeds 20, or the %idle goes below 10. Or a "conn"
  alert if pingtime exceeds 250 ms. (I have a pretty solid idea about
  how this can be implemented - and it's elegant enough that it would
  also work for data from custom graphs).
- And I'd like to make the webpages 100% dynamic and ditch the
  statically generated overview pages. Which could mean that Hobbit
  would require something like PHP for the display part, or that I 
  need to learn about how XML/XSLT etc. works.
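
The idea of triggering status changes from graph data can be sketched as a small rule table. This is only an illustration, not the planned implementation: the metric names, the `Rule` structure, and the colors assigned to each rule are all assumptions; only the limits (20, 10, 250) come from the list above.

```python
from dataclasses import dataclass

SEVERITY = {"green": 0, "yellow": 1, "red": 2}

@dataclass(frozen=True)
class Rule:
    metric: str    # name of the graphed value (hypothetical names)
    op: str        # "above" or "below"
    limit: float
    color: str     # color to report when the rule trips

RULES = [
    Rule("cpu_system", "above", 20, "yellow"),   # %system time exceeds 20
    Rule("cpu_idle", "below", 10, "yellow"),     # %idle goes below 10
    Rule("ping_ms", "above", 250, "red"),        # ping time exceeds 250 ms
]

def evaluate(rules, sample):
    """Return the worst color produced by any tripped rule.

    sample maps metric names to their latest values; metrics that
    are missing from the sample are skipped.
    """
    worst = "green"
    for rule in rules:
        value = sample.get(rule.metric)
        if value is None:
            continue
        tripped = value > rule.limit if rule.op == "above" else value < rule.limit
        if tripped and SEVERITY[rule.color] > SEVERITY[worst]:
            worst = rule.color
    return worst

print(evaluate(RULES, {"cpu_system": 25, "cpu_idle": 50}))
print(evaluate(RULES, {"ping_ms": 300}))
```

Because the rules only refer to metric names and limits, the same table could in principle cover data from custom graphs as well.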

So I won't get bored right away, but I do hope that development 
could slow down just a bit and there would be more activity just to
broaden the range of systems/applications/whatever that Hobbit can
monitor.


Henrik
list Henrik Størner · Wed, 18 Jan 2006 23:23:31 +0100 ·
On Fri, Jan 13, 2006 at 11:01:11AM -0700, Charles Jones wrote:
> Is there any chance that Hobbit will soon support comparing the ping
> replies to specified values for green, yellow, and red?

"soon" is asking for a lot :-)

As I wrote in another thread - yes, that is going to show up in Hobbit
sometime. It won't be specifically for the "conn" test, but rather I
am planning a facility so that all of the data that goes into the RRD's
can trigger a change of status color. And this would then in turn
trigger all of the normal effects like alerts, history logging and such.

> Something like:
>
> 1.2.3.4 myhost.com # conn:200:500
>
> This would make myhost.com's conn test go yellow if the ping was between
> 200 and 500ms, and red if it was over 500ms.

The bb-hosts file is getting overloaded, and at some point we'll have to
think of a better way of configuring Hobbit. But yes - some way of
setting thresholds for these data and changing the status color if
they are exceeded.

> Since hobbit already graphs the numeric values of the ping replies, this
> seems like it would be fairly easy to add?

As always, the devil's in the detail. The particular reason why this is
not as easy as it sounds is that the color of a network test status is
currently determined exclusively by the network test tool - which may
run on a separate host from the main hobbit server, which is the one
that looks at the network test output and picks out the ping time that
goes into the graph.

So by the time Hobbit learns what the ping time is, we're long past
the point where the status color has been decided.

The idea then is to have some sort of status color modification
mechanism - so even though the network tester reports "green", the
RRD module can tell Hobbit "no, make it be red because the ping
time is higher than 3 seconds".

I just need to come up with a good way of configuring it, and design
the status-override mechanism to work sanely with history logs and
such (you don't want the RRD override to show up in the history logs
in a way so the status flip-flops between green from the network
tester and red from the RRD override).
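
One possible shape for such an override, sketched in Python under stated assumptions: only three colors are handled, the function names and the 1000/3000 ms limits are invented for the example, and this is not how Hobbit implements anything. The network tester's color and the color derived from the ping time are merged by taking the more severe one, and only the merged result is written to the history, so the log does not flip-flop between the two sources.

```python
# Assumed severity order; real status colors include more than these three.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def data_color(ping_ms, warn_ms, crit_ms):
    """Color suggested by the ping time alone (hypothetical limits)."""
    if ping_ms > crit_ms:
        return "red"
    if ping_ms > warn_ms:
        return "yellow"
    return "green"


def final_color(tester_color, ping_ms, warn_ms, crit_ms):
    """Merge the tester's color with the data-derived one; the worse wins."""
    candidate = data_color(ping_ms, warn_ms, crit_ms)
    return max(tester_color, candidate, key=lambda c: SEVERITY[c])


def record(history, color):
    """Log only changes of the merged color, never the raw tester color."""
    if not history or history[-1] != color:
        history.append(color)


history = []
for ping_ms in (80, 3500, 3200, 90):
    record(history, final_color("green", ping_ms, warn_ms=1000, crit_ms=3000))
print(history)  # ['green', 'red', 'green']
```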


Henrik
list Rich Smrcina · Wed, 18 Jan 2006 22:01:25 -0600 ·
If I missed some other part of the discussion, I apologize.  Couldn't this just be handled by the code that analyzes the fping results?

For hosts that need to respond in a certain amount of time, maybe add a number to the conn test indicator:

1.1.1.1 myhost # conn:3

Without the above specification everything would work as before, but with it, if a ping to myhost takes more than 3 seconds, generate a red condition.
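
To make the proposal concrete, here is a small Python sketch of how such a tag might be read. The "conn:N" syntax is only the suggestion above, not an existing bb-hosts option, and the function name is hypothetical.

```python
def parse_conn_limit(line):
    """Return (ip, hostname, limit_seconds) from a bb-hosts style line.

    limit_seconds is None when no numeric "conn:N" tag is present.
    """
    hostpart, _, tagpart = line.partition("#")
    fields = hostpart.split()
    if len(fields) < 2:
        raise ValueError("expected 'IP hostname # tags'")
    ip, hostname = fields[0], fields[1]
    limit = None
    for tag in tagpart.split():
        if tag.startswith("conn:"):
            try:
                limit = float(tag[len("conn:"):])
            except ValueError:
                pass  # not a number: leave the limit unset
    return ip, hostname, limit


print(parse_conn_limit("1.1.1.1 myhost # conn:3"))  # ('1.1.1.1', 'myhost', 3.0)
print(parse_conn_limit("1.1.1.1 myhost # conn"))    # ('1.1.1.1', 'myhost', None)
```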
quoted from Henrik Størner

Henrik Stoerner wrote:

I just need to come up with a good way of configuring it, and design
the status-override mechanism to work sanely with history logs and
such (you don't want the RRD override to show up in the history logs
in a way so the status flip-flops between green from the network
tester and red from the RRD override).

-- 

Rich Smrcina
VM Assist, Inc.
Main: (262)392-2026
Cell: (XXX)XXX-XXXX
Ans Service:  (360)715-2467
user-61add9955ef9@xymon.invalid

Catch the WAVV!  http://www.wavv.org
WAVV 2006 - Chattanooga, TN - April 7-11, 2006
list Henrik Størner · Thu, 19 Jan 2006 07:12:03 +0100 ·
quoted from Rich Smrcina
On Wed, Jan 18, 2006 at 10:01:25PM -0600, Rich Smrcina wrote:
If I missed some other part of the discussion, I apologize.  Couldn't 
this just be handled by the code that analyzes the fping results?
It could, but I'd rather have a general solution to the problem of
"how do you alert based on the performance data" than a specific
solution that only does the "alert me when the ping time is too
great".


Henrik
list Michael Nemeth · Thu, 19 Jan 2006 07:13:37 -0500 ·
First, incoming mail from the mailing list is being blocked, so I have to read the archive! I hope I can still
send!
- I want to be able to trigger status-changes (and hence alerts) from the data that only goes into the graphs, currently. E.g. instead of the CPU alert triggering if the load average goes above 5 (which is pretty meaningless nowadays), I'd like it to trigger a warning if the %system time exceeds 20, or the %idle goes below 10. Or a "conn" alert if pingtime exceeds 250 ms. (I have a pretty solid idea about how this can be implemented - and it's elegant enough that it would also work for data from custom graphs).
Also, how about percentage change: alert when CPU changes by 50% or disk changes by 20%, either up, down, or both ways? See http://www.i-pi.com/watcher.html
Now might be a good time to do this too.
-- 
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|     _p_       Mike Nemeth
|  ___| |_____  email(w) user-609d3fab5b2d@xymon.invalid Work: XXX XXX-XXXX
|><___________)
|               Home Page:http://www.geocities.com/mjnemeth/
|               Work Page:http://faraday.motown.lmco.com:3000/~nemethm/
|               Work Page:http://ortsweb/~mnemeth/
|++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
list Scott Walters · Thu, 19 Jan 2006 15:03:58 -0500 (EST) ·
quoted from Michael Nemeth
Also, how about percentage change: alert when CPU changes by 50% or disk changes by 20%, either up, down, or both ways?
See http://www.i-pi.com/watcher.html
Now might be a good time to do this too.
The aberrant behaviour detection RRAs in rrdtool would be the easiest way to
accomplish this, but that gets pretty complicated quickly.
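
For comparison, the plain percentage-change check asked about above is easy to sketch on its own. Note this is the simple version in Python, not rrdtool's aberrant behaviour detection (which is based on Holt-Winters forecasting), and the function names are hypothetical.

```python
def percent_change(previous, current):
    """Signed change from previous to current, in percent (None if previous is 0)."""
    if previous == 0:
        return None  # relative change is undefined against a zero baseline
    return (current - previous) / abs(previous) * 100.0


def changed_too_much(previous, current, limit_pct, direction="both"):
    """True when the change reaches limit_pct in the given direction."""
    change = percent_change(previous, current)
    if change is None:
        return False
    if direction == "up":
        return change >= limit_pct
    if direction == "down":
        return change <= -limit_pct
    return abs(change) >= limit_pct


print(changed_too_much(40, 60, 50))            # True: CPU went up by 50%
print(changed_too_much(100, 85, 20, "down"))   # False: disk dropped only 15%
```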


-- 
Scott Walters
-PacketPusher