I had a similar problem and what helped me track it down was the --cfid option to xymond_alert. I found the rule I thought was firing wasn't the rule that was actually firing - it was a catch-all that I'd left in by mistake.
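For anyone wanting to try the same check, here is a rough sketch of a dry run that asks xymond_alert which rules match. The host name sta10143 is taken from the message quoted further down, the 20-minute duration is only an example value, and I am assuming xymoncmd is on the PATH and that this is run as the xymon user on the server.

```shell
# Ask xymond_alert which alerts.cfg rules would match a red "conn"
# status on sta10143 that has lasted 20 minutes (example values).
# --test only reports the matching rules; it is meant for debugging.
if command -v xymoncmd >/dev/null 2>&1; then
    xymoncmd xymond_alert --test sta10143 conn --color=red --duration=20
else
    echo "xymoncmd not found; run this on the Xymon server"
fi
```

To get the rule's line number into live alerts, --cfid is added to the xymond_alert command line, which on a stock 4.3 install should be in the [alert] section of tasks.cfg.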
Hope that helps.
Carl
Carl Inglis
-----Original Message-----
From: Xymon [mailto:xymon-bounces at xymon.com] On Behalf Of Kevin VerMeer
Sent: 07 April 2014 16:23
To: J.C. Cleaver
Cc: xymon at xymon.com
Subject: Re: [Xymon] question on DURATION and DOWNSECS in alerts
We are just starting with our Xymon setup, so I pretty much assume any issues are on our/my end.
We are running 4.3.13.
It does not appear to be waiting 15m before sending.
The default is minutes, correct?
-----Original Message-----
From: J.C. Cleaver [mailto:user-87556346d4af@xymon.invalid]
Sent: Monday, April 07, 2014 10:01 AM
To: Kevin VerMeer
Cc: xymon at xymon.com
Subject: Re: [Xymon] question on DURATION and DOWNSECS in alerts
On Mon, April 7, 2014 5:55 am, Kevin VerMeer wrote:
I have a question on what I am seeing in some of the alerts being
generated.
I have a xymon alert script set up for connection events.
The alerts.cfg entry is:
HOST=* SERVICE=conn COLOR=red
SCRIPT /usr/local/xymonutil/alertscripts/noconnectivity.sh
DURATION>15 REPEAT=15 RECOVERED
To me that says the script should only be invoked if connectivity is
down for 15 minutes, repeat every 15 if still down, and one final time
when connectivity is back up.
Within the script that gets invoked, a message gets created like this:
MSG="Xymon is reporting no connectivty to $STATION.\n Current time:
$DATE.\n Number of seconds down: $DOWNSECS. \n Time down: $TIMEDOWN. \n"
And one MSG that is generated is
Xymon is reporting no connectivty to sta10143.
Current time: 04/06/14 18:00:55.
Number of seconds down: 60.
Time down: 00:01:00.
I would have expected the $DOWNSECS variable to be the total time
down, even including the original 15 minute DURATION.
Is that thinking correct? Or does DOWNSECS only include the time down
after the DURATION kicks in?
Or am I off base on something else?
Kevin,
We're doing something similar and - AFAIK - $DOWNSECS is indeed the total duration of the incident (although this may be subject to any flap detection that's enabled).
Questions:
What version are you running?
Does it actually wait 15 minutes before sending the first message?
Regards,
-jc
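A minimal sketch of a script body that records exactly what xymond_alert handed over, which may help narrow down why the first alert shows only 60 seconds. DOWNSECS, RECOVERED, CFID and BBHOSTNAME are environment variables that xymond_alert sets for SCRIPT recipients, as far as I read the alerts.cfg man page. The function names and message layout are made up for illustration and are not the original noconnectivity.sh.

```shell
#!/bin/sh
# Illustrative alert script body; not the original noconnectivity.sh.

# Convert a count of seconds into HH:MM:SS.
format_downtime() {
    secs=${1:-0}
    printf '%02d:%02d:%02d' \
        $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

# Build the message from the environment set by xymond_alert.
# Defaults are supplied so the function can also be run by hand.
build_message() {
    printf 'Xymon is reporting no connectivity to %s.\n' "${BBHOSTNAME:-unknown}"
    printf 'Number of seconds down: %s.\n' "${DOWNSECS:-0}"
    printf 'Time down: %s.\n' "$(format_downtime "${DOWNSECS:-0}")"
    # CFID is the alerts.cfg line number of the rule that fired.
    printf 'Matched alerts.cfg line: %s.\n' "${CFID:-unknown}"
    if [ "${RECOVERED:-0}" = "1" ]; then
        printf 'Status: recovered.\n'
    fi
}

build_message
```

Run by hand with DOWNSECS=900 and BBHOSTNAME=sta10143 exported, the "Time down" line should read 00:15:00. If the CFID line number in a real alert does not point at the DURATION>15 rule, a different rule is the one firing.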