Xymon Mailing List Archive

hobbit-alerts.cfg - DURATION

8 messages in this thread

list Anatoli Bogajewski · Tue, 13 Mar 2007 15:22:36 +0100 ·
Dear Hobbits,

Does the DURATION keyword in hobbit-alerts.cfg refer to the time a test has spent in one particular state (yellow or red), or more generally to the time since it first left the green state? For example, I want to get exactly one notification for the yellow state and one for the red state, but the following configuration does not work: I am notified of the initial yellow alert, but not of the red one occurring 4 minutes later.

HOST=myhost SERVICE=disk COLOR=yellow DURATION<3
SCRIPT $SSMSS $ABSMS REPEAT=5 RECOVERED

HOST=myhost SERVICE=disk COLOR=red DURATION<3
SCRIPT $SSMSS $ABSMS REPEAT=5 RECOVERED

Any ideas? Thanks :-)

Mit freundlichen Grüßen / Yours sincerely
 Anatoli Bogajewski
list Gary Baluha · Tue, 13 Mar 2007 11:53:25 -0400 ·
In general, Hobbit processes the hobbit-alerts.cfg file from top-down, using
the first matching alert.  At least in my experience with the way we're
using DURATION, it should be counting the time from when the alert changes
status (so, green-to-yellow, yellow-to-red, etc), though to be fair, we're
using it differently than you are in this example.

Try using the bbcmd "hobbitd_alert" test below to see if it is working as
intended.  It can be used as below:
/var/hobbit/server/bin/bbcmd hobbitd_alert --test <hostname> <host test>

Also, you might want to consider using DURATION<3m (specifying "m" for
minutes).  I'm not sure what the default is, but I personally prefer to be
explicit; makes reading it a little easier as well.  And if REPEAT=5,
assuming 5 is in minutes, you'll never get a repeat message for yellow or
red status if DURATION<3m.
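
The last point can be checked with a little arithmetic. The following is a minimal Python sketch, not Hobbit code. It assumes that DURATION and REPEAT are both in minutes, that the duration is measured from the start of the event, and that repeats are attempted at multiples of the REPEAT interval. The function name alert_minutes is made up for illustration.

```python
def alert_minutes(duration_below, repeat_every, horizon):
    """Return the minutes (since the event started) at which a rule with
    DURATION<duration_below and REPEAT=repeat_every would still fire.

    Simplified model: an alert is attempted at minute 0 and then at every
    multiple of repeat_every, and it is sent only while the event is
    younger than duration_below minutes.
    """
    attempts = range(0, horizon + 1, repeat_every)
    return [minute for minute in attempts if minute < duration_below]


# DURATION<3 with REPEAT=5, observed over one hour
print(alert_minutes(3, 5, 60))
```

Under these assumptions only the attempt at minute 0 falls inside the 3-minute window, so the first repeat at minute 5 is never sent.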
list Anatoli Bogajewski · Wed, 14 Mar 2007 12:01:36 +0100 ·
Hello,

thanks for your reply.

user-ae3e15c22de1@xymon.invalid wrote on 13.03.2007 16:53:25:
> it should be counting the time from when the alert changes status (so, green-to-yellow, yellow-to-red, etc)

I thought so.

> Try using the bbcmd "hobbitd_alert" test below to see if it is working as intended.  It can be used as below:
> /var/hobbit/server/bin/bbcmd hobbitd_alert --test <hostname> <host test>

It works in principle as expected, although there is no way to reproduce my scenario using the test utility.

> Also, you might want to consider using DURATION<3m (specifying "m" for minutes).  I'm not sure what the default is, but I personally prefer to be explicit; makes reading it a little easier as well.

From the man page: "The duration is specified as a number, _optionally_ followed by 'm' (minutes, default), 'h' (hours) or 'd' (days)."
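
The rule quoted from the man page can be illustrated with a small converter. This is a sketch in Python covering only the units named in the quote; it is not Hobbit's own parser, which may accept other forms, and the name duration_to_minutes is hypothetical.

```python
import re

# Minutes per unit, following the man page text quoted above:
# 'm' (minutes, default), 'h' (hours), 'd' (days).
UNIT_MINUTES = {"m": 1, "h": 60, "d": 1440}


def duration_to_minutes(text):
    """Convert a value such as '3', '3m', '1h' or '2d' to minutes."""
    match = re.fullmatch(r"(\d+)([mhd]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    number, unit = match.groups()
    # A bare number falls back to minutes, the documented default.
    return int(number) * UNIT_MINUTES[unit or "m"]


for value in ("3", "3m", "1h", "1d"):
    print(value, "->", duration_to_minutes(value), "minutes")
```

In this reading, DURATION<3 and DURATION<3m mean the same thing, so adding the "m" only helps readability.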

--debug output of hobbitd_alert looks like:

(initial yellow alert)

2007-03-13 14:38:58 hobbitd_alert: Got message 1139 @@page#1139|1173793138.770212|xx.xx.xx.xx|myhost|disk|xx.xx.xx.xx|1173794938|yellow|green|1173793138|pct|643201|||
2007-03-13 14:38:58 startpos 2590, fillpos 2590, endpos -1
2007-03-13 14:38:58 Got page message from myhost:disk
2007-03-13 14:38:58 Alert status changed from 0 to 1
2007-03-13 14:38:58 Found a first matching rule
2007-03-13 14:38:58 No more secondary matching rule
2007-03-13 14:38:58 1 alerts to go
2007-03-13 14:38:58 Found a first matching rule
2007-03-13 14:38:58 send_alert myhost:disk state 0
2007-03-13 14:38:58 No more secondary matching rule
2007-03-13 14:38:58 Want msg 1140, startpos 2590, fillpos 2590, endpos -1, usedbytes=0, bufleft=263649
2007-03-13 14:38:58 Found a first matching rule
2007-03-13 14:38:58   repeat myhost|disk|script|0123456789 at 0
2007-03-13 14:38:58   Alert for myhost:disk to 0123456789
2007-03-13 14:38:58 Opening file /opt/hobbit/server/etc/bb-hosts

(4 minutes later, the red alert is raised)

2007-03-13 14:42:49 hobbitd_alert: Got message 1223 @@page#1223|1173793369.998387|xx.xx.xx.xx|myhost|disk|xx.xx.xx.xx|1173795169|red|yellow|1173793369|pct|643201|||
2007-03-13 14:42:49 startpos 47243, fillpos 47243, endpos -1
2007-03-13 14:42:49 Got page message from myhost:disk
2007-03-13 14:42:49 Severity increased, cleared repeat interval: myhost/disk yellow->red
2007-03-13 14:42:49 Found no first matching rule
2007-03-13 14:42:49 Want msg 1224, startpos 47243, fillpos 47243, endpos -1, usedbytes=0, bufleft=218996

So, hm. I am not sure I captured the lines of interest, but this does not look very helpful.

Cheers,
Anatoli
list Larry Barber · Wed, 14 Mar 2007 10:28:33 -0500 ·
I think you have the inequality backwards in your DURATION clause. As it is
written, no alert will be issued for events that are older than 3
minutes; it should probably be DURATION>3, not DURATION<3.

Thanks,
Larry Barber
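
The timestamps in the posted debug output allow a quick check of this explanation. The sketch below is plain Python arithmetic on the two epoch values that appear in the @@page messages. It assumes that DURATION is measured from the first non-green status rather than from the latest colour change, which the thread does not confirm.

```python
# Epoch timestamps copied from the two "Got message" lines in the debug output.
yellow_seen = 1173793138  # green -> yellow, logged at 14:38:58
red_seen = 1173793369     # yellow -> red, logged at 14:42:49

elapsed_seconds = red_seen - yellow_seen
elapsed_minutes = elapsed_seconds / 60

# Assumption: DURATION is measured from the first non-green status,
# not from the most recent colour change.
limit_minutes = 3
rule_matches = elapsed_minutes < limit_minutes  # DURATION<3

print(f"{elapsed_seconds} s = {elapsed_minutes:.2f} min, DURATION<3 matches: {rule_matches}")
```

The red status arrived 231 seconds (3.85 minutes) after the yellow one. Under the stated assumption the event is already older than 3 minutes at that point, which would be consistent with the "Found no first matching rule" line in the log.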

list Gary Baluha · Wed, 14 Mar 2007 15:13:53 -0400 ·
Now that I think of it, if the goal is just to have the alert send an email
once, you probably just want to remove the REPEAT= part (not sure if there
is a default for this), or optionally change it to something like
REPEAT=1d.  In that case, the DURATION isn't needed.
list Manny Cortes · Wed, 14 Mar 2007 18:29:52 -0400 ·
We use DURATION in our case as a way to escalate notifications to another group of recipients 15 minutes after the initial event occurred in hobbit. The initial alert goes to our onsite Operations folks then after 15 minutes, a custom script fires off that informs all in that particular recipient group that the event is still ongoing and it is being escalated.
 
    So: qpage pages Operations as soon as the event occurs, and they monitor the event.
    DURATION>15: the second script fires off.
 
Working pretty well so far....
 
Could REPEAT be used for further escalation? Or will another DURATION>30 suffice?
 
My 2 cents :)
 
Manny
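
The escalation scheme described above can be modelled roughly as follows. This is a Python sketch, not Hobbit configuration. It assumes that every rule whose DURATION condition holds is applied, and the third tier (DURATION>30) is purely hypothetical, added only to illustrate the question about further escalation. All recipient names are made up.

```python
# (recipient, minimum event age in minutes); names are illustrative only.
RULES = [
    ("operations", None),  # no DURATION: paged as soon as the event occurs
    ("escalation", 15),    # DURATION>15
    ("management", 30),    # hypothetical further tier, DURATION>30
]


def recipients(event_age):
    """Return who would be notified for an event of the given age in minutes,
    assuming all rules whose DURATION condition holds are applied."""
    return [name for name, above in RULES if above is None or event_age > above]


for age in (0, 16, 31):
    print(age, recipients(age))
```

In this model, a second DURATION>30 rule simply adds one more group once the event is older than 30 minutes.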

list Johann Eggers · Thu, 15 Mar 2007 10:03:58 +0100 ·
> Could REPEAT be used for further escalation? Or will another DURATION>30
> suffice?

This is from the hobbit-alerts.cfg man page:

Rule matching an alert if the event has lasted longer/shorter than the given duration. E.g. DURATION>1h (lasted longer than 1 hour) or DURATION<30 (only sends alerts the first 30 minutes).

That is exactly how we use the DURATION tag.
We have specified DURATION>5 on most of our alert rules, because a test often fails and then returns to green after, e.g., 2 minutes. In that case we don't want to be alerted by mail or SMS.
If the red condition still holds after more than 5 minutes, an alarm is sent out.
We also use the REPEAT tag to resend the alarm, at an interval based on the importance of the system, so that the appropriate people are aware that the problem is still not fixed.

Regards
Johann
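
The combination of DURATION>5 and REPEAT described above can be sketched as a small simulation. This is a simplified Python model, not Hobbit's actual scheduling: it assumes the rule is checked once per minute and that all values are in minutes. The REPEAT value of 30 and the function name alerts_sent are chosen only for illustration.

```python
def alerts_sent(event_length, duration_above, repeat_every):
    """Minutes (since the status left green) at which an alert is sent.

    Simplified model: the rule is checked once per minute while the event
    lasts; it matches when the event is older than duration_above minutes,
    and a new alert goes out only if repeat_every minutes have passed
    since the previous one.
    """
    sent = []
    for minute in range(event_length + 1):
        if minute <= duration_above:
            continue  # DURATION>N not yet satisfied
        if not sent or minute - sent[-1] >= repeat_every:
            sent.append(minute)
    return sent


print(alerts_sent(2, 5, 30))   # short glitch, back to green after 2 minutes
print(alerts_sent(70, 5, 30))  # long outage
```

In this model the 2-minute glitch produces no alert at all, while the 70-minute outage produces a first alert once the event is older than 5 minutes and then a repeat every 30 minutes.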
list Anatoli Bogajewski · Fri, 16 Mar 2007 11:33:59 +0100 ·
Hey guys,

thanks to all for the input.

The initial question does not seem to be answered yet, but for the moment I am using a workaround similar to what Gary suggested: REPEAT=1d works fine (the default would be 30 minutes), with no DURATION statement. Still, it would be interesting to know for certain whether DURATION is counted from each status change or from the moment the test first enters any alert state. At the moment I would guess that the DURATION count is not cleared the way the REPEAT interval is:

2007-03-13 14:42:49 Severity increased, cleared repeat interval: myhost/disk yellow->red

So the very last question: is this a bug or a feature?

Cheers,
Anatoli 