Xymon Mailing List Archive

Duration reset on status change

3 messages in this thread

Patrick Vaughan · Sun, 12 Aug 2007 21:54:47 -0400
I know this has been discussed before, but I have a slight problem with
the way duration is calculated.

I need to send an email if a test is yellow, and escalating pages if it
is red.  My problem is that a test went yellow over the weekend, when
nobody was checking email, and then went red in the middle of the night.
That caused techs, backup techs, managers, and the CEO to be paged
because the duration was already over 24 hours.

Is there any way to reset the duration counter whenever the state changes,
instead of only when the state changes from green?
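
To make the scenario concrete, here is a minimal Python sketch of a duration counter that only restarts when a status leaves green. It is an illustration of the behaviour described in this message, not Xymon's actual code; the function name and the timeline are hypothetical.

```python
from datetime import datetime, timedelta

def duration_since_green(events, now):
    """Time elapsed since the status last left green.

    events: list of (timestamp, color) tuples in chronological order.
    Returns timedelta(0) if the latest status is green or no events exist.
    """
    left_green_at = None
    for ts, color in events:
        if color == "green":
            left_green_at = None   # back to green: the counter is cleared
        elif left_green_at is None:
            left_green_at = ts     # first non-green status starts the clock
        # yellow -> red (or red -> yellow) does not restart the clock
    if left_green_at is None:
        return timedelta(0)
    return now - left_green_at

# Hypothetical timeline: yellow on Saturday, red early on Monday.
events = [
    (datetime(2007, 8, 10, 9, 0), "green"),
    (datetime(2007, 8, 11, 10, 0), "yellow"),
    (datetime(2007, 8, 13, 2, 0), "red"),
]
now = datetime(2007, 8, 13, 2, 5)

elapsed = duration_since_green(events, now)
escalate = elapsed > timedelta(hours=24)
print(elapsed, escalate)
```

Five minutes after the test turns red, the counter already reads more than 24 hours, so a rule keyed on "red for over 24 hours" matches at once.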

Henrik Størner · Mon, 13 Aug 2007 15:36:17 +0000 (UTC)
In <user-c504ddc180d2@xymon.invalid> "Patrick Vaughan" <user-37bf2569640f@xymon.invalid> writes:
> I need to send an Email if a test is yellow, and escalating pages if it
> is red.  My problem is that a test went yellow over the weekend when
> nobody was checking Email, then in the middle of the night went to red.
> Which caused techs, backup techs, managers, and the CEO to be paged
> because the duration was over 24 hours.
>
> Is there any way to reset the duration counter when the state changes.
> Instead of just when the state changes from green?

No.

The problem with this is that you'll get tons of alerts if a status
is wobbling just around the red threshold.
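
The wobbling concern can be sketched roughly as follows. This is a simplified Python model, not Xymon's alert logic: it only counts how often the duration counter would restart under each policy, on the assumption that every restart makes the status look like a fresh event to the alert rules. The function name and sample data are hypothetical.

```python
def counter_restarts(colors, reset_on_every_change):
    """Count how many times the duration counter restarts while non-green."""
    restarts = 0
    previous = "green"
    for color in colors:
        if color != previous:
            left_green = previous == "green"
            # The counter always restarts when leaving green; it restarts
            # on other transitions only under the reset-on-change policy.
            if color != "green" and (left_green or reset_on_every_change):
                restarts += 1
        previous = color
    return restarts

# Hypothetical samples of a status hovering around the red threshold.
samples = ["green", "yellow", "red", "yellow", "red", "yellow", "red", "yellow"]

since_green = counter_restarts(samples, reset_on_every_change=False)
on_change = counter_restarts(samples, reset_on_every_change=True)
print(since_green, on_change)
```

With these samples the since-green policy sees one event, while reset-on-change sees a new one at every flip between yellow and red.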

I do understand your predicament with the current way alerting works,
but right now I haven't found the optimum way of doing it.


Regards,
Henrik
Pat Vaughan · Tue, 21 Aug 2007 14:11:33 -0400
Just my $.02, but I would add a second DURATION-type counter that gets
reset on every state change, and add a keyword called something like
LASTCHANGE.  That way users can decide how they want to handle
duration-based rules.  Some users may want rules based on the last time a
test was "green" and other rules based on the last time the state changed.
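
A rough Python sketch of this two-counter idea is shown below. LASTCHANGE is only the keyword name suggested in this message, and the `StatusClock` class and its methods are hypothetical; nothing here reflects an existing Xymon feature.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class StatusClock:
    color: str = "green"
    left_green_at: Optional[datetime] = None    # basis for DURATION
    last_change_at: Optional[datetime] = None   # basis for proposed LASTCHANGE

    def update(self, color: str, ts: datetime) -> None:
        if color == self.color:
            return                      # no state change, both clocks keep running
        if color == "green":
            self.left_green_at = None   # recovered: DURATION is cleared
        elif self.color == "green":
            self.left_green_at = ts     # leaving green starts DURATION
        self.last_change_at = ts        # every change restarts LASTCHANGE
        self.color = color

    def duration(self, now: datetime) -> timedelta:
        if self.left_green_at is None:
            return timedelta(0)
        return now - self.left_green_at

    def lastchange(self, now: datetime) -> timedelta:
        if self.last_change_at is None:
            return timedelta(0)
        return now - self.last_change_at

clock = StatusClock()
clock.update("yellow", datetime(2007, 8, 11, 10, 0))
clock.update("red", datetime(2007, 8, 13, 2, 0))
now = datetime(2007, 8, 13, 2, 5)

# A rule on DURATION escalates at once; a rule on LASTCHANGE would wait.
page_on_duration = clock.color == "red" and clock.duration(now) > timedelta(hours=24)
page_on_lastchange = clock.color == "red" and clock.lastchange(now) > timedelta(hours=24)
print(page_on_duration, page_on_lastchange)
```

Keeping both clocks lets each rule pick its own reference point, so a rule that prefers the since-green behaviour is unaffected.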