resend: 2 questions
list Jeff Newman
Hi,

I didn't see a reply, so I thought I'd resend in case it got lost in the shuffle.

Hi All,

Two questions:

QUESTION #1: Is it possible to have a third alert color? Meaning: one of my customers wants a setup like this. A custom script runs on the client server and reports:

    foo : 80

for example. They want less than 85 to be green, 85-90 yellow, 90-95 red, and above 95 some other color, say orange. As far as I can tell, I can only use green, yellow, and red for alerts, and blue and purple are reserved.

QUESTION #2: Let's say #1 above is possible, so my script sends Hobbit the status line based on the value it sees, with a status of green, yellow, red, or orange. The Hobbit server receives it and uses the NCV module to build the RRD, etc. In hobbit-alerts.cfg, does the SERVICE keyword work for custom NCV-type columns?

Thanks.
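
For reference, the threshold scheme in question #1 can be sketched as a small client-side script. This is only an illustration under stated assumptions: the host name `client1.example.com`, the column name `foo`, and the helper names are hypothetical; the overlapping boundaries (85-90, 90-95) are resolved by treating each lower bound as inclusive; and because only green, yellow, and red are available, everything from 90 upward is reported as red. The message uses the usual `status HOST.TEST COLOR text` layout, with dots in the host name written as commas.

```python
def color_for(value, warn=85.0, panic=90.0):
    """Map a reading to one of the three available alert colors."""
    if value >= panic:
        return "red"
    if value >= warn:
        return "yellow"
    return "green"


def build_status(host, column, value):
    """Build a status message carrying an NCV-style 'name : value' line."""
    color = color_for(value)
    # Dots in the host name are written as commas in the status message.
    target = f"{host.replace('.', ',')}.{column}"
    return f"status {target} {color}\n\n{column} : {value}\n"


if __name__ == "__main__":
    # Hypothetical host and column names, for illustration only.
    print(build_status("client1.example.com", "foo", 80))
```

The resulting string would then be handed to whatever client command or transport the site already uses to deliver status messages; that step is left out here on purpose.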
list Gary Baluha
On Fri, Jul 18, 2008 at 10:59 AM, Jeff Newman <user-e96740e73ca8@xymon.invalid> wrote:

> QUESTION #1: Is it possible to have a third color alert?

Currently, no. But it might help to understand why 4 alert levels are desired.

> QUESTION #2: [...] does the SERVICE keyword work for custom NCV type columns?

The SERVICE tag in hobbit-alerts.cfg works for any column name, NCV or otherwise.
list Michael Nemeth
One case I can think of: even at 100% you may have lots of space, but if you hit 0 free you HAVE to do something!
list Gary Baluha
The philosophy Hobbit uses for alerting is that you're okay until you reach a certain threshold. At that point (yellow) you still have to respond to the event and take care of it before it becomes a bigger issue. If it continues, you reach another threshold where stuff can (and usually does) break. At this point, you _need_ to respond to the event.

What you are proposing is a fourth level such that you are "beyond critical". This is a similar concept to being "fatally killed" (as opposed to just being "killed").

The trick to running a successful monitoring system is setting the thresholds in the first place (which is easier said than done), such that you don't have any false positives but, even more importantly, no false negatives (i.e. an alert you should have gotten, but didn't).

Can you give a more specific example (as far as I.P./security will allow) of what you are trying to accomplish?
list Michael Nemeth
Sorry, I disagree! I can have gigs of space left at 100%, which is not critical at all! It's not "beyond critical"; it's fatal if you hit zero free! Either one needs finer granularity (aren't numerical limits in the works?) or a new fatal color. I have licenses that run near 100% all the time too.
list Gary Baluha
If that's the case, a fourth color would have the same limitation ;-) (That's a lot of disk space if 100% full = gigs of free space.)

With the lack of finer granularity, the only option you have is to create a custom script (client-side or server-side should work in this case) that checks the _amount_ (as opposed to the _percentage_) of free space and sets a green/yellow/red threshold based on that. You could then set up the Hobbit alert rules like any other test, and it sounds like this would solve your particular problem. (A client-side script would probably be the easiest to set up, depending on how many machines it would need to be propagated to.)
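
A custom check on the amount of free space, rather than the percentage, might look roughly like the sketch below. It is a minimal illustration, not an existing Hobbit script: the function names, the path `/`, and the 5 GiB / 1 GiB limits are all made-up assumptions, and reporting the result to the server is left out.

```python
import shutil


def color_for_free(free_bytes, warn_bytes, panic_bytes):
    """Return red/yellow/green based on the absolute amount of free space."""
    if free_bytes <= panic_bytes:
        return "red"
    if free_bytes <= warn_bytes:
        return "yellow"
    return "green"


def check_path(path, warn_bytes, panic_bytes):
    """Look up free space on the filesystem holding `path` and classify it."""
    free = shutil.disk_usage(path).free
    return color_for_free(free, warn_bytes, panic_bytes), free


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Hypothetical limits: yellow at 5 GiB free or less, red at 1 GiB or less.
    color, free = check_path("/", warn_bytes=5 * GIB, panic_bytes=1 * GIB)
    print(f"{color} {free // GIB} GiB free on /")
```

Because the limits are absolute, a very large filesystem that sits at 100% by percentage but still has gigabytes free stays green until the free space itself runs low.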
list Magnus Carlebjörk
Hi,

I also have fairly large filesystems running nominally with just a few gigs left, and I also run into the percentage problem. Looking through the description in hobbit-clients.cfg, there seems to be a solution:

. . .
# DISK filesystem IGNORE
#   If the utilization of "filesystem" is reported to exceed "warnlevel"
#   or "paniclevel", the "disk" status will go yellow or red, respectively.
#   "warnlevel" and "paniclevel" are either the percentage used, or the
#   space available as reported by the local "df" command on the host.
#   For the latter type of check, the "warnlevel" must be followed by the
#   letter "U", e.g. "1024U".
. . .

This would give us the possibility to test on explicit disk space. I tried configuring it (Windows clients) but never got it right. Has anyone else tried this, or is it not implemented yet? Could there be a formatting discrepancy from the Windows client? I do see a difference between the capacity columns reported: in my setup, it is reported with % from the Windows client and with 1k-blocks from Unix clients. Could this affect the test?

Regards,
--
Magnus Carlebjork
Stockholm, Sweden
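
To make the two kinds of level in that excerpt concrete, here is a small sketch of how a plain number and a "U"-suffixed number could be told apart and evaluated. It is not Hobbit's own code. The function names are hypothetical, and the comparison directions are an assumption read from the comment text: a percentage level is taken to trip when usage is at or above it, and a "U" level when the available space reported by df is at or below it.

```python
def parse_level(token):
    """Split a level such as '90' or '1024U' into (number, is_absolute)."""
    if token.endswith("U"):
        return int(token[:-1]), True
    return int(token), False


def level_exceeded(token, pct_used, avail_units):
    """Check one level against a filesystem's df figures.

    Assumption: an absolute ('U') level trips when the available space is
    at or below it; a percentage level trips when usage is at or above it.
    """
    number, is_absolute = parse_level(token)
    if is_absolute:
        return avail_units <= number
    return pct_used >= number


def disk_color(warn, panic, pct_used, avail_units):
    """Classify a filesystem given its warn and panic level tokens."""
    if level_exceeded(panic, pct_used, avail_units):
        return "red"
    if level_exceeded(warn, pct_used, avail_units):
        return "yellow"
    return "green"
```

Under these assumptions, a filesystem at 99% used with 5,000,000 units available would be red against the levels "90" and "95" but green against "2048U" and "1024U", which is the behavior the large-filesystem case needs.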
list Michael Nemeth
Actually, the licenses are a better example. Right now I can create numeric limits of, say, 97-102 yellow and 103-121 red, but I have no way of telling when I go over. And that is the first question management is going to ask, since they are very happy to see their money well spent with 100% utilization. My ClearCase scripts DO return rejections. So with orange I could tell management how many times it happened (at least that) and how long it was orange. Also, of course, I could try to handle the orange condition! The point is that a "drop dead" color is useful.
list Jeff Newman
Right. I think the concept is:

Level 1: "Warning everyone, something bad could happen, or might not, may want to look"
  - Yellow
Level 2: "Hey look, it was just a warning before, but now it's bad and service might be interrupted unless you take action, this is your last chance buddy!"
  - Red
Level 3: "I've told you repeatedly, and now look what's happened! You've reached super critical orange level! That means within minutes your service will be dead. Run for the hills, the sky is falling, the phone is about to ring non-stop"
  - Orange

I think 3 levels make sense for some specific applications.

-jeff
list Tom Kauffman
Is the format of the 'status' message for the memory test documented anywhere? I'm trying to get the memory available (or used) from a P6 multi-lpar system to generate the built-in graphs. I'm getting the rrd created, but it has no values.

TIA - Tom Kauffman NIBCO, Inc
list Gary Baluha
I say the color should be brown, then...
list Michael Nemeth
Yes, that's it. And as I said, then I can track the orange status for management.
▸
Jeff Newman wrote:
Right. I think the concept is:
Level 1: "warning everyone, something bad could happen, or might not, may want to look" - Yellow
Level 2: "Hey look, it was just a warning before, but now it's bad and service might be interrupted unless you take action, this is your last chance buddy!" - Red
Level 3: "I've told you repeatedly, and now look what's happened! You've reached super critical orange level! That means within minutes your service will be dead. Run for the hills, the sky is falling, the phone is about to ring non-stop" - Orange
I think 3 levels makes sense for some specific applications.
-jeff

On Mon, Jul 21, 2008 at 4:57 AM, michael nemeth <user-609d3fab5b2d@xymon.invalid> wrote:
Actually, the licenses are a better example. Right now I can create numeric limits of, say, 97-102 yellow and 103 to 121 red, but I have no way of telling when I go over. And that's the first question management is going to ask, since they are very happy to see their money well spent with 100% utilization. My ClearCase script does return rejections. So with orange I could tell management how many times (at least that) and how long it was orange. Also, of course, I could try to handle the orange condition! Point is, a "drop dead" color is useful.

Gary Baluha wrote:
If that's the case, a fourth color would have the same limitation ;-) (That's a lot of disk space if 100% full = gigs of free space.)
With the lack of a finer granularity, the only option you have is to create a custom script (client-side or server-side should work in this case) that checks the _amount_ (as opposed to _percentage_) of free space, and sets a green/yellow/red threshold based on that. You could then set up the Hobbit alert rules like any other test, and it sounds like this would solve your particular problem. (A client-side script would probably be the easiest to set up, depending on how many machines it would need to be propagated to.)

On Fri, Jul 18, 2008 at 2:57 PM, michael nemeth <user-609d3fab5b2d@xymon.invalid> wrote:
Sorry, disagree! I can have gigs of space left at 100%, not critical at all! It's not "beyond critical", it's fatal if you hit zero free! Either one needs finer granularity (aren't numerical limits in the works?) or a new fatal color. I have some that run near 100% all the time too.

Gary Baluha wrote:
The philosophy Hobbit uses for alerting is that you're okay until you reach a certain threshold. At that point (yellow) you still have to respond to the event and take care of it before it becomes a bigger issue. If it continues, then you reach another threshold where stuff can (and usually does) break. At this point, you _need_ to respond to the event.
What you are proposing is a fourth level such that you are "beyond critical". This is a similar concept to being "fatally killed" (as opposed to just being "killed"). The trick to running a successful monitoring system is setting the thresholds in the first place (which is easier said than done), such that you don't have any false positives but, even more importantly, no false negatives (i.e. an alert you should have gotten, but didn't).
Can you give a more specific example (in as far as I.P./security will allow) of what you are trying to accomplish?

On Fri, Jul 18, 2008 at 11:52 AM, michael nemeth <user-609d3fab5b2d@xymon.invalid> wrote:
One case I can think of is that even at 100% you've lots of [space], but if you hit 0 free you HAVE to do something!

Gary Baluha wrote:
On Fri, Jul 18, 2008 at 10:59 AM, Jeff Newman <user-e96740e73ca8@xymon.invalid> wrote:
Hi, didn't see a reply, so thought I'd do a resend in case it got lost in the shuffle.
Hi All, two questions:
QUESTION #1: Is it possible to have a third color alert? Meaning: one of my customers wants a setup like this: a custom script runs on the client server and reports, for example, foo : 80. They want less than 85 to be green, 85-90 yellow, 90-95 red, and above 95 any color, say orange. So far as I can tell, I can only use green, yellow, and red for alerts, and blue and purple are reserved.

Currently, no. But it might help to understand why 4 alert levels are desired.

QUESTION #2: Let's say #1 above is possible, so my script sends Hobbit the status line based on the value it sees, with the status of green, yellow, red, or orange. The Hobbit server receives it and uses the NCV module to build the RRD etc. In hobbit-alerts.cfg, does the SERVICE keyword work for custom NCV-type columns?

The SERVICE tag in hobbit-alerts.cfg works for any column name, NCV or otherwise.
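
Gary's suggestion in the quoted thread above (check the _amount_ of free space rather than the _percentage_, and map it to green/yellow/red) could be sketched roughly as below. This is only an illustration, not anything posted on the list: the host name, the column name `freespace`, and the 10 GiB / 2 GiB thresholds are made-up values, and the script assumes the usual `status host.column color text` message layout. It only prints the message; handing it to the Hobbit client for delivery is left out.

```python
import shutil

GIB = 1024 ** 3  # bytes per gibibyte


def color_for_free_bytes(free_bytes, yellow_below, red_below):
    """Map an absolute amount of free space to a status color.

    Thresholds are in bytes; lower free space is worse.
    """
    if free_bytes < red_below:
        return "red"
    if free_bytes < yellow_below:
        return "yellow"
    return "green"


def build_status(host, column, path, yellow_below=10 * GIB, red_below=2 * GIB):
    """Build a status line for one filesystem (thresholds are hypothetical)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    color = color_for_free_bytes(usage.free, yellow_below, red_below)
    free_gib = usage.free / GIB
    # Assumed layout: "status <host>.<column> <color> <free text>"
    return f"status {host}.{column} {color} {path}: {free_gib:.1f} GiB free"


if __name__ == "__main__":
    # "myhost" and "freespace" are placeholder names for illustration.
    print(build_status("myhost", "freespace", "."))
```

Because the thresholds are absolute, a filesystem that sits at 100% "used" by percentage but still has gigabytes free stays green, which is the case Michael describes.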
list Michael Nemeth
Yes, you're right, IT does hit the fan then.
list Dan McDonald
▸
On Wed, 2008-07-23 at 06:22 -0400, michael nemeth wrote:
Yes you're right IT does hit the fan then. Gary Baluha wrote:I say the color should be brown, then...
Brown and orange are the same hue, just different intensities :-)

But I would also like a "hey, it's really, really bad" notification. We page when there is 30 minutes left on a UPS. I'd like to page again when there is 5 minutes left (and it's not always 25 minutes later...).

Another instance is that we page when it is over 80 degrees F in a comm room. I'd like to page again when it gets to 100...

I know I could write additional tests and have columns for both upsmin and upsfailimminent, but a third threshold would be easier to maintain...
--
Daniel J McDonald, CCIE #2495, CISSP #78281, CNX Austin Energy http://www.austinenergy.com
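
For readers following along, the two-column workaround Dan mentions (separate `upsmin` and `upsfailimminent` columns) might look something like the sketch below. The column names come from his message; the function, its parameters, and the choice to report each column as simply green or red are assumptions made for illustration, not an existing Hobbit feature.

```python
def ups_columns(minutes_left, first_page_at=30, second_page_at=5):
    """Return a {column: color} mapping for the two-column workaround.

    Each column has a single threshold, so each can trigger its own page:
    "upsmin" turns red at the first threshold, "upsfailimminent" at the second.
    """
    return {
        "upsmin": "red" if minutes_left <= first_page_at else "green",
        "upsfailimminent": "red" if minutes_left <= second_page_at else "green",
    }


if __name__ == "__main__":
    for minutes in (45, 20, 3):
        print(minutes, ups_columns(minutes))
```

The same split would apply to the comm room temperature case with the comparisons reversed (higher is worse). The cost is the one Dan points out: every extra level means another column and another alert rule to maintain.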
list Rdeal
You could page on yellow at 80F and then again on red when the room reaches 100F... Same with the UPS: yellow is 30 minutes left on the UPS, red is 5 minutes left... You wouldn't want to see any of these on anything other than green.
list Jeff Newman
Well, the first step is getting Hobbit to recognize and act upon a third color (be it brown or orange). I'm not a good enough coder to do this.
-Jeff
list Jeff Newman
We know this. I don't think the question is why have 3 colors; others have expressed a desire for a third color. The question is how.