Xymon Mailing List Archive

resend: 2 questions

17 messages in this thread

list Jeff Newman · Fri, 18 Jul 2008 09:59:41 -0500 ·
Hi,

I didn't see a reply, so I thought I'd resend in case it got lost in
the shuffle.

Hi All,

Two questions:

QUESTION #1: Is it possible to have a third color alert? Meaning:

One of my customers wants a setup like this:

Custom script runs on client server, reports:

foo : 80

for example.

They want less than 85 to be green, 85-90 yellow, 90-95 red, and above
95 any color, say orange.
So far as I can tell, I can only use green, yellow, and red for
alerts, and blue and purple are reserved.

QUESTION #2:

Let's say #1 above is possible, so my script sends Hobbit the status
line based on the value it sees, with a
status of green, yellow, red, or orange. The Hobbit server receives
it and uses the NCV module to build the RRD, etc.
In hobbit-alerts.cfg, does the SERVICE keyword work for custom
NCV-type columns?

Thanks.
list Gary Baluha · Fri, 18 Jul 2008 11:06:20 -0400 ·
On Fri, Jul 18, 2008 at 10:59 AM, Jeff Newman <user-e96740e73ca8@xymon.invalid> wrote:
Hi,

didn't see a reply, so thought i'd do a resend in case it got lost in
the shuffle

Hi All,

Two questions:

QUESTION #1: Is it possible to have a third color alert? Meaning:

One of my customers wants a setup like this:

Custom script runs on client server, reports:

foo : 80

for example.

They want less than 85 to be green, 85-90 yellow, 90-95 red, and above
95 any color, say orange.
So far as I can tell, I can only use green, yellow, and red for
alerts, and blue and purple are reserved.

Currently, no.  But it might help to understand why 4 alert levels are
desired.

QUESTION #2:
Let's say #1 above is possible, so my script sends Hobbit the status
line based on the value it sees, with a
status of green, yellow, red, or orange. The Hobbit server receives
it and uses the NCV module to build the RRD, etc.
In hobbit-alerts.cfg, does the SERVICE keyword work for custom
NCV-type columns?

The SERVICE tag in hobbit-alerts.cfg works for any column name, NCV or
otherwise.
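
As a rough illustration of the answer above: with only green, yellow and red available, a client-side script can fold the customer's top band into red and still report the raw number in an NCV-style "foo : 80" line. The Python below is a sketch, not Hobbit code. The host name, the column name "foo", the exact band edges (85 and 90 treated as inclusive lower bounds) and the summary text are all assumptions, and it only builds the message text rather than sending it.

```python
def classify(value, warn=85.0, panic=90.0):
    """Map a reading to one of the three alerting colors.

    Assumed bands: below warn is green, warn up to panic is yellow,
    panic and above is red (the requested "above 95" band folds into red).
    """
    if value >= panic:
        return "red"
    if value >= warn:
        return "yellow"
    return "green"


def build_status(host, column, name, value):
    """Build a status message carrying one NCV-style 'name : value' line."""
    color = classify(value)
    # Assumption: dots in the host name are written as commas in the message.
    target = host.replace(".", ",") + "." + column
    header = "status {} {} {} check".format(target, color, column)
    return "{}\n\n{} : {}\n".format(header, name, value)


# Hypothetical host and column names, for illustration only.
print(build_status("client1.example.com", "foo", "foo", 80))
```

Because the column is just a name, an alert rule with SERVICE=foo would then apply to it like any other test.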
list Michael Nemeth · Fri, 18 Jul 2008 11:52:59 -0400 ·
One case I can think of is that even at 100% you may have lots of space
left, but if you hit 0 free you HAVE to do something!
list Gary Baluha · Fri, 18 Jul 2008 14:32:46 -0400 ·
The philosophy Hobbit uses for alerting is that you're okay until you reach
a certain threshold.  At that point (yellow) you still have to respond to
the event and take care of it, before it becomes a bigger issue.  If it
continues, then you reach another threshold where stuff can (and usually
does) break.  At this point, you _need_ to respond to the event.

What you are proposing is a fourth level such that you are "beyond
critical".  This is a similar concept to being "fatally killed" (as opposed
to just being "killed").  The trick to running a successful monitoring
system is setting the thresholds in the first place (which is easier said
than done), such that you don't have any false-positives, but even more
importantly, no false-negatives (i.e. an alert you should have gotten, but
didn't).

Can you give a more specific example (insofar as I.P./security will allow)
of what you are trying to accomplish?

list Michael Nemeth · Fri, 18 Jul 2008 14:57:24 -0400 ·
Sorry, I disagree!
I can have gigs of space left at 100%, which is not critical at all! It's not "beyond critical"; it's fatal if you hit zero free!
Either one needs finer granularity (aren't numerical limits in the works?) or a new fatal color. I have licenses that run near 100% all the time too.
list Gary Baluha · Fri, 18 Jul 2008 15:30:02 -0400 ·
If that's the case, a fourth color would have the same limitation ;-)
(That's a lot of disk space if 100% full = gigs of free space)

With the lack of a finer granularity, the only option you have is to create
a custom script (client-side or server-side should work in this case) that
checks the _amount_ (as opposed to _percentage_) of free space, and set a
green/yellow/red threshold based on that.  You could then set up the Hobbit
alert rules like any other test, and it sounds like this would solve your
particular problem.

(a client-side script would probably be the easiest to set up, depending on
how many machines it would need to be propagated to)
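
A minimal sketch of that idea, assuming a Python client-side script: it looks at the amount of free space rather than the percentage used and maps it to green/yellow/red. The 5 GiB and 1 GiB thresholds and the function names are placeholders chosen for illustration, not values from this thread.

```python
import shutil

GIB = 1024 ** 3


def classify_free(free_bytes, warn_bytes=5 * GIB, panic_bytes=1 * GIB):
    """Return a color based on absolute free space (assumed thresholds)."""
    if free_bytes <= panic_bytes:
        return "red"
    if free_bytes <= warn_bytes:
        return "yellow"
    return "green"


def check_path(path):
    """Look up free space for the filesystem holding 'path' and classify it."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return classify_free(usage.free), usage.free


color, free_bytes = check_path(".")
print("{} : {} bytes free".format(color, free_bytes))
```

A very large filesystem that is nominally "100% used" but still has tens of gigabytes free would stay green here, while a small one with a few megabytes left would go red.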

list Magnus Carlebjörk · Sat, 19 Jul 2008 07:14:16 +0200 (CEST) ·
Hi,

I also have fairly large filesystems running nominally with just a few gigs
left, and I also run into the percentage problem. Looking through the description in
hobbit-clients.cfg, there seems to be a solution:

  . . .
#    DISK filesystem IGNORE
#       If the utilization of "filesystem" is reported to exceed "warnlevel"
#       or "paniclevel", the "disk" status will go yellow or red, respectively.
#       "warnlevel" and "paniclevel" are either the percentage used, or the
#       space available as reported by the local "df" command on the host.
#       For the latter type of check, the "warnlevel" must be followed by the
#       letter "U", e.g. "1024U".
  . . .

This would give us the possibility to test on explicit disk space. I tried
configuring it (Windows clients) but never got it right. Has anyone else tried this, or is
it not implemented yet? Could there be a formatting discrepancy from the
Windows client? I do see a difference between the capacity columns reported. In
my setup, it is reported with % from the Windows client and with 1k-blocks from
Unix clients. Could this affect the test?

Regards,
--
Magnus Carlebjork
Stockholm, Sweden
+46 76 116 9008
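
To make the quoted comment concrete, here is a small Python sketch of how a level such as "90" versus "1024U" could be read and applied. It is only an interpretation of the comment text above, not Hobbit's actual parser: the function names, the treatment of "U" values as "space available at or below this number", and the inclusive comparisons are all assumptions.

```python
def parse_level(text):
    """Return ('units', n) for a level like '1024U', else ('percent', n)."""
    text = text.strip()
    if text.upper().endswith("U"):
        return ("units", int(text[:-1]))
    return ("percent", int(text.rstrip("%")))


def disk_color(used_pct, avail_units, warnlevel, paniclevel):
    """Apply warn/panic levels to one filesystem reading.

    used_pct    -- percentage used, as reported by the client
    avail_units -- space available in the client's df units
    """
    def exceeded(level):
        kind, number = parse_level(level)
        if kind == "units":
            # Assumed meaning: trigger when available space drops to the level.
            return avail_units <= number
        return used_pct >= number

    if exceeded(paniclevel):
        return "red"
    if exceeded(warnlevel):
        return "yellow"
    return "green"


# A filesystem shown as 100% used but with plenty of units still available.
print(disk_color(100, 5000000, "1024U", "512U"))
```

If a client reports only a percentage and no available-space column, a "U" level has nothing to compare against in this sketch, which is one way the difference between the Windows and Unix reports could matter.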
list Michael Nemeth · Mon, 21 Jul 2008 05:57:51 -0400 ·
Actually, the licenses are a better example. Right now I can create
numeric limits of, say,
97-102 yellow and 103 to 121 red, but I have no way of telling when I go
over. And that's the first question
management is going to ask, since they are very happy to see their money
well spent with 100%
utilization.
My ClearCase script DOES return rejections. So with orange I could tell
management how many times
(at least that) and how long it was orange. Also, of course, try to
handle the orange condition!

The point is that a "drop dead" color is useful.
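
The "how many times and how long" report can be sketched without a fourth color, assuming the raw license counts are available as timestamped samples (for example, exported from the RRD). The Python below is illustrative only: the sample data, the limit of 121 and the function name are made up, and durations are approximated from the sample timestamps.

```python
def over_limit_episodes(samples, limit):
    """Count episodes above 'limit' and the total time spent there.

    samples -- list of (timestamp_seconds, value), sorted by timestamp
    Returns (episode_count, total_seconds_over).
    """
    episodes = 0
    total_seconds = 0
    over_since = None
    for timestamp, value in samples:
        if value > limit and over_since is None:
            # A new over-limit episode starts at this sample.
            over_since = timestamp
            episodes += 1
        elif value <= limit and over_since is not None:
            # The episode ends at the first sample back under the limit.
            total_seconds += timestamp - over_since
            over_since = None
    if over_since is not None:
        # Still over the limit at the last sample: count up to that point.
        total_seconds += samples[-1][0] - over_since
    return episodes, total_seconds


# Hypothetical five-minute samples of license usage.
readings = [(0, 100), (300, 125), (600, 130), (900, 110), (1200, 122), (1500, 122)]
print(over_limit_episodes(readings, 121))
```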
list Jeff Newman · Tue, 22 Jul 2008 10:57:45 -0500 ·
Right. I think the concept is:

Level 1: "Warning everyone, something bad could happen, or might not, may want to look."
                  - Yellow
Level 2: "Hey look, it was just a warning before, but now it's bad and service might be interrupted unless you take action. This is your last chance, buddy!"
                  - Red
Level 3: "I've told you repeatedly, and now look what's happened! You've reached super critical orange level! That means within minutes your service will be dead. Run for the hills, the sky is falling, the phone is about to ring non-stop."
                  - Orange

I think 3 levels makes sense for some specific applications.

-jeff
list Tom Kauffman · Tue, 22 Jul 2008 13:40:25 -0400 ·
Is the format of the 'status' message for the memory test documented anywhere?

I'm trying to get the memory available (or used) from a P6 multi-lpar system to generate the built-in graphs.

I'm getting the rrd created, but it has no values.

TIA -

Tom Kauffman
NIBCO, Inc
list Gary Baluha · Tue, 22 Jul 2008 14:18:44 -0400 ·
I say the color should be brown, then...

On Tue, Jul 22, 2008 at 11:57 AM, Jeff Newman <user-e96740e73ca8@xymon.invalid> wrote:
Right. I think the concept is:

Level 1: "Warning everyone, something bad could happen, or might not, may want to look."
                  - Yellow
Level 2: "Hey look, it was just a warning before, but now it's bad and service might be interrupted unless you take action. This is your last chance, buddy!"
                  - Red
Level 3: "I've told you repeatedly, and now look what's happened! You've reached super critical orange level! That means within minutes your service will be dead. Run for the hills, the sky is falling, the phone is about to ring non-stop."
                  - Orange

I think 3 levels makes sense for some specific applications.

-jeff

list Michael Nemeth · Wed, 23 Jul 2008 06:20:03 -0400 ·
Yes, that's it. And as I said, I can then track the orange status for
management.
Jeff Newman wrote:
Right. I think the concept is:

Level 1: "Warning everyone, something bad could happen, or might not, may want to look."
                  - Yellow
Level 2: "Hey look, it was just a warning before, but now it's bad and service might be interrupted unless you take action. This is your last chance, buddy!"
                  - Red
Level 3: "I've told you repeatedly, and now look what's happened! You've reached super critical orange level! That means within minutes your service will be dead. Run for the hills, the sky is falling, the phone is about to ring non-stop."
                  - Orange

I think 3 levels makes sense for some specific applications.

-jeff

On Mon, Jul 21, 2008 at 4:57 AM, michael nemeth <user-609d3fab5b2d@xymon.invalid> wrote:
  
Actually, licenses are a better example.  Right now I can create numeric
limits of, say, 97-102 yellow and 103 to 121 red, but I have no way of
telling when I go over.  And that is the first question management is
going to ask, since they are very happy to see their money well spent at
100% utilization.
My ClearCase script DOES return rejections.  So with orange I could tell
management how many times (at least that) and for how long it was orange.
Also, of course, I could try to handle the orange condition!

The point is that a "Drop Dead" color is useful.

Gary Baluha wrote:

If that's the case, a fourth color would have the same limitation ;-)
(That's a lot of disk space if 100% full = gigs of free space)

With the lack of a finer granularity, the only option you have is to create
a custom script (client-side or server-side should work in this case) that
checks the _amount_ (as opposed to _percentage_) of free space, and set a
green/yellow/red threshold based on that.  You could then set up the Hobbit
alert rules like any other test, and it sounds like this would solve your
particular problem.

(a client-side script would probably be the easiest to set up, depending on
how many machines it would need to be propagated to)
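
A minimal sketch of the kind of client-side check described above,
assuming Python is available on the client. It looks at the amount of
free space rather than the percentage used and picks green/yellow/red
from that. The 5 GiB and 1 GiB thresholds, the host name, and the column
name "freespace" are hypothetical values chosen only for the example.

```python
import shutil

GIB = 1024 ** 3


def color_for_free_bytes(free_bytes, warn_bytes=5 * GIB, panic_bytes=1 * GIB):
    """Pick a color from the absolute amount of free space (hypothetical limits)."""
    if free_bytes <= panic_bytes:
        return "red"
    if free_bytes <= warn_bytes:
        return "yellow"
    return "green"


def build_status(host, column, path):
    """Build a status message reporting free space on the given path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    color = color_for_free_bytes(usage.free)
    # Dots in the host name are written as commas in status messages.
    target = host.replace(".", ",") + "." + column
    free_mib = usage.free // (1024 ** 2)
    return f"status {target} {color} {free_mib} MiB free on {path}\n"


if __name__ == "__main__":
    # Placeholder host and column names; the message still has to be sent
    # to the Hobbit server by the usual client mechanism.
    print(build_status("client.example.com", "freespace", "/"))
```

With this approach a filesystem that sits at 100% used but still has
gigabytes free stays green, and only a genuinely small amount of free
space turns the column red.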

On Fri, Jul 18, 2008 at 2:57 PM, michael nemeth <user-609d3fab5b2d@xymon.invalid>
wrote:
    
Sorry, disagree!
I can have gigs of space left at 100%, which is not critical at all!!!!
It's not "beyond critical"; it's fatal if you hit zero free!
Either one needs finer granularity (aren't numerical limits in the works?)
or a new fatal color.  I have some that run near 100% all the time too.


Gary Baluha wrote:

The philosophy Hobbit uses for alerting is that you're okay until you
reach a certain threshold.  At that point (yellow) you still have to respond
to the event and take care of it, before it becomes a bigger issue.  If it
continues, then you reach another threshold where stuff can (and usually
does) break.  At this point, you _need_ to respond to the event.

What you are proposing is a fourth level such that you are "beyond
critical".  This is a similar concept to being "fatally killed" (as opposed
to just being "killed").  The trick to running a successful monitoring
system is setting the thresholds in the first place (which is easier said
than done), such that you don't have any false-positives, but even more
importantly, no false-negatives (i.e. an alert you should have gotten, but
didn't).

Can you give a more specific example (in as far as I.P./security will
allow) of what you are trying to accomplish?

On Fri, Jul 18, 2008 at 11:52 AM, michael nemeth <user-609d3fab5b2d@xymon.invalid>
wrote:
      
One case I can think of: even at 100% you've lots of space left, but if
you hit 0 free you HAVE to do something!

Gary Baluha wrote:

On Fri, Jul 18, 2008 at 10:59 AM, Jeff Newman <user-e96740e73ca8@xymon.invalid>
wrote:
        
Hi,

didn't see a reply, so thought i'd do a resend in case it got lost in
the shuffle

Hi All,

Two questions:

QUESTION #1: Is it possible to have a third color alert? Meaning:

One of my customers wants a setup like this:

Custom script runs on client server, reports:

foo : 80

for example.

They want less than 85 to be green, 85-90 yellow, 90-95 red, and above
95 any color, say orange.
So far as I can tell, I can only use green, yellow, and red for
alerts, and blue and purple are reserved.
          
Currently, no.  But it might help to understand why 4 alert levels are
desired.

        
QUESTION #2:

Let's say #1 above is possible, so my script sends hobbit the status
line based on the value it sees, with the
status of green, yellow, red, and orange. The hobbit server receives
it, and uses the NCV module to build the rrd etc.
In hobbit-alerts.cfg, does the SERVICE keyword work for custom
NCV type columns?
          
The SERVICE tag in hobbit-alerts.cfg works for any column name, NCV or
otherwise.

list Michael Nemeth · Wed, 23 Jul 2008 06:22:31 -0400 ·
Yes you're right IT does hit the fan then.
quoted from Gary Baluha

Gary Baluha wrote:
I say the color should be brown, then...


list Dan McDonald · Wed, 23 Jul 2008 07:15:50 -0500 ·
quoted from Michael Nemeth
On Wed, 2008-07-23 at 06:22 -0400, michael nemeth wrote:
Yes you're right IT does hit the fan then.

Gary Baluha wrote: 
I say the color should be brown, then...
Brown and orange are the same hue, just different intensities :-)

But I would also like a "hey it's really, really bad" notification.  We
page when there is 30 minutes left on a UPS.  I'd like to page again
when there is 5 minutes left (and it's not always 25 minutes later...)

Another instance is that we page when it is over 80 degrees F in a comm
room.  I'd like to page again when it gets to 100...

I know I could write additional tests and have columns for both upsmin
and upsfailimminent, but a third threshold would be easier to
maintain...
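
The two-column workaround mentioned above could look roughly like the
sketch below: one UPS runtime reading produces two status messages, so
that each column can carry its own alert rule and page separately. The
column names upsmin and upsfailimminent and the 30 and 5 minute limits
come from the message; the host name and the message text are
placeholders.

```python
def ups_statuses(host, minutes_left, first_page=30, second_page=5):
    """Return two status messages, one per column, from a single reading."""
    # Dots in the host name are written as commas in status messages.
    target = host.replace(".", ",")
    text = f"{minutes_left} minutes of UPS runtime left"
    # Each column goes red at its own limit, so the second column only
    # alerts once the situation has become much worse.
    first = "red" if minutes_left <= first_page else "green"
    second = "red" if minutes_left <= second_page else "green"
    return [
        f"status {target}.upsmin {first} {text}\n",
        f"status {target}.upsfailimminent {second} {text}\n",
    ]


if __name__ == "__main__":
    # "ups1.example.com" is a placeholder host name.
    for message in ups_statuses("ups1.example.com", 20):
        print(message, end="")
```

The cost is the one already noted: two columns and two sets of alert
rules to maintain instead of a single column with a third threshold.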
-- 

Daniel J McDonald, CCIE #2495, CISSP #78281, CNX
Austin Energy
http://www.austinenergy.com
list Rdeal · Wed, 23 Jul 2008 08:35:45 -0400 ·
You could page on yellow at 80F and then again at red when the room comes to
100F...

Same with the ups, yellow is 30 minutes left on UPS, red is 5 minutes
left...
You wouldn't want to see any of these on anything other than green.
list Jeff Newman · Wed, 23 Jul 2008 10:09:16 -0500 ·
Well, the first step is getting hobbit to recognize and act upon a
third color (be it brown or orange)

I'm not a good enough coder to do this.

-Jeff



list Jeff Newman · Mon, 28 Jul 2008 14:20:24 -0500 ·
We know this. I don't think the question is why have 3 colors; others
have expressed a desire for a third color. The question is how.
quoted from Rdeal


On Wed, Jul 23, 2008 at 7:35 AM, rdeal <user-a44af7422b8a@xymon.invalid> wrote:
You could page on yellow at 80F and then again at red when the room comes to
100F...

Same with the ups, yellow is 30 minutes left on UPS, red is 5 minutes
left...
You wouldn't want to see any of these on anything other than green.
