Xymon Mailing List Archive

question on DURATION and DOWNSECS in alerts

Kevin VerMeer
Mon, 7 Apr 2014 10:41:41 -0500
Message-Id: <user-91e3f940a5f8@xymon.invalid>

I changed the alert.cfg settings to be 15m instead of 15, to see if that made a difference.
It did not appear to make any difference.  The script was still invoked right away.

-----Original Message-----
From: Kevin VerMeer 
Sent: Monday, April 07, 2014 10:23 AM
To: 'J.C. Cleaver'
Cc: xymon at xymon.com
Subject: RE: [Xymon] question on DURATION and DOWNSECS in alerts

We are just starting with our Xymon setup, so I pretty much assume any issues are on our/my end.
We are running 4.3.13.
It does not appear to be waiting 15m before sending.
The default is minutes, correct?  


-----Original Message-----
From: J.C. Cleaver [mailto:user-87556346d4af@xymon.invalid] 
Sent: Monday, April 07, 2014 10:01 AM
To: Kevin VerMeer
Cc: xymon at xymon.com
Subject: Re: [Xymon] question on DURATION and DOWNSECS in alerts


On Mon, April 7, 2014 5:55 am, Kevin VerMeer wrote:
I have a question on what I am seeing in some of the alerts being 
generated.
I have a xymon alert script set up for connection events.
The alert.cfg entry is:
HOST=* SERVICE=conn COLOR=red
        SCRIPT /usr/local/xymonutil/alertscripts/noconnectivity.sh
DURATION>15 REPEAT=15 RECOVERED

To me that says the script should only be invoked if connectivity is 
down for 15 minutes, repeat every 15 if still down, and one final time 
when connectivity is back up.

Within the script that gets invoked, a message gets created like this:
MSG="Xymon is reporting no connectivty to $STATION.\n  Current time:
$DATE.\n  Number of seconds down: $DOWNSECS. \n  Time down: $TIMEDOWN. \n"

And one MSG that is generated is
Xymon is reporting no connectivty to sta10143.
 Current time: 04/06/14 18:00:55.
 Number of seconds down: 60.
 Time down: 00:01:00.
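
For reference, the relationship between the two values in that message can be sketched as below. This is only a minimal sketch, not the actual noconnectivity.sh: it assumes DOWNSECS is the value Xymon passes to the script in its environment, and that TIMEDOWN is computed by the script itself from DOWNSECS. The helper name format_downsecs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Xymon): format a number of seconds as HH:MM:SS.
format_downsecs() {
    secs=${1:-0}
    printf '%02d:%02d:%02d\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

# DOWNSECS is assumed to be set in the environment when the alert script
# is invoked; default to 0 so the script can also be run by hand.
TIMEDOWN=$(format_downsecs "${DOWNSECS:-0}")
echo "Number of seconds down: ${DOWNSECS:-0}. Time down: $TIMEDOWN."
```

With DOWNSECS=60 this yields a "Time down" of 00:01:00, matching the message above, which suggests the script was invoked about one minute into the outage rather than after 15 minutes.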

I would have expected the $DOWNSECS variable to be the total time 
down, even including the original 15 minute DURATION.
Is that thinking correct?  Or does DOWNSECS only include the time down 
after the DURATION kicks in?
Or am I off base on something else?

Kevin,

We're doing something similar and - AFAIK - $DOWNSECS is indeed the total duration of the incident (although this may be subject to any flap detection that's enabled).

Q's:
What version are you running?
and Does it actually wait 15m before sending the first message?


Regards,
-jc