Xymon Mailing List Archive

possible xymon 4.3.21 holiday alerting bug?

Japheth Cleaver
Thu, 8 Oct 2015 08:14:03 -0700
Message-Id: <user-9bce680b7815@xymon.invalid>


On Wed, October 7, 2015 11:58 pm, Gavin Stone-Tolcher wrote:
Hi, we are seeing unusual alerting behaviour with a Xymon 4.3.21 server
using a "holidays.cfg" with HOLIDAYLIKEWEEKDAY=0.

We have a network operations team (uqnoc-sms) that gets alerts during
business hours (TIME=W:0800:1700), and a data networks team (dn-sms) that
gets out-of-business-hours alerts in certain windows
(TIME=W:0600:0759,W:1701:2200,60:0600:2200).

Rules are like:

PAGE=$UNSMSREGEX EXHOST=$UNEXCLUDE
        MAIL user-760dce092658@xymon.invalid SERVICE=$UNSMSSVCS DURATION>6m TIME=W:0800:1700 COLOR=red REPEAT=1w FORMAT=SMS RECOVERED
        MAIL user-c6d0660c5139@xymon.invalid SERVICE=$UNSMSSVCS DURATION>6m TIME=W:0600:0759,W:1701:2200,60:0600:2200 COLOR=red REPEAT=1w FORMAT=SMS RECOVERED

For a "red" conn test covered by the rule on a weekday public holiday, it
seems to correctly decide not to send an alert to "uqnoc-sms"
(TIME=W:0800:1700) and instead correctly generates an alert to "dn-sms"
(the TIME=60:0600:2200 component), but it then keeps sending the same alert
approximately every minute (my xymonnet poll cycle). Is it ignoring
REPEAT=1w?
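
The window matching described above can be sketched in Python. This is only an illustration of the behaviour as reported, not Xymon's actual code: the names `xymon_day` and `time_matches` are hypothetical, the example date is arbitrary, and the sketch assumes that HOLIDAYLIKEWEEKDAY=0 causes a holiday to be evaluated as day 0 (Sunday).

```python
from datetime import datetime

WEEKDAYS = {1, 2, 3, 4, 5}  # day numbers as used in TIME specs: 0=Sunday ... 6=Saturday


def xymon_day(when, is_holiday, holiday_like_weekday=0):
    """Return the day number (0=Sunday) used for matching.

    Assumption: a holiday is evaluated as the day given by
    HOLIDAYLIKEWEEKDAY instead of its real weekday.
    """
    if is_holiday:
        return holiday_like_weekday
    return (when.weekday() + 1) % 7  # Python's weekday() uses 0=Monday


def time_matches(spec, when, is_holiday=False, holiday_like_weekday=0):
    """Check a TIME spec such as 'W:0600:0759,60:0600:2200'."""
    day = xymon_day(when, is_holiday, holiday_like_weekday)
    hhmm = when.strftime("%H%M")
    for window in spec.split(","):
        days, start, end = window.split(":")
        allowed = set()
        for ch in days:
            if ch == "*":
                allowed |= set(range(7))   # every day
            elif ch in "Ww":
                allowed |= WEEKDAYS        # Monday to Friday
            else:
                allowed.add(int(ch))       # explicit day digit
        # Zero-padded HHMM strings compare correctly as text.
        if day in allowed and start <= hhmm <= end:
            return True
    return False


NOC = "W:0800:1700"
DN = "W:0600:0759,W:1701:2200,60:0600:2200"
monday_10am = datetime(2015, 10, 5, 10, 0)  # a Monday, chosen only as an example

print(time_matches(NOC, monday_10am))                   # ordinary Monday
print(time_matches(NOC, monday_10am, is_holiday=True))  # Monday treated as Sunday
print(time_matches(DN, monday_10am, is_holiday=True))   # matches the 60 window
```

Under these assumptions the three calls print True, False and True, which agrees with the recipient selection described: on a weekday holiday the W window is skipped and the 60 window applies. The sketch says nothing about why the alert then repeats.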

Before I try to debug much further, I thought I would ask whether anyone
else has seen similar behaviour.

Hmm. Does the REPEAT value work with a smaller interval (such as 1d or
1h)? And what type of system are you running on?

I'm curious whether there's a REPEAT overflow or underflow going on, rather
than something specific to the TIME exclusion switching back and forth.
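
For comparison when testing 1h, 1d and 1w, the behaviour one would expect from REPEAT can be modelled roughly as below. This is a hypothetical Python model, not Xymon's code: `parse_interval` and `should_send` are made-up names, the unit suffixes (m, h, d, w) are assumed from the values used in this thread, and treating a bare number as minutes is also an assumption.

```python
UNIT_MINUTES = {"m": 1, "h": 60, "d": 1440, "w": 10080}
INT32_MAX = 2**31 - 1


def parse_interval(text):
    """Convert '6m', '1h', '1d' or '1w' into minutes.

    Assumption: a bare number is already in minutes.
    """
    if text[-1] in UNIT_MINUTES:
        return int(text[:-1]) * UNIT_MINUTES[text[-1]]
    return int(text)


def should_send(last_sent, now, repeat):
    """Decide whether a repeat alert is due.

    last_sent and now are epoch seconds; last_sent is None when
    no alert has been sent yet for this event.
    """
    if last_sent is None:
        return True
    return now - last_sent >= parse_interval(repeat) * 60


week_seconds = parse_interval("1w") * 60
print(week_seconds)               # seconds in one week
print(week_seconds <= INT32_MAX)  # does it fit in a signed 32-bit integer?

# One poll cycle (60 s) after the first alert, nothing should be re-sent.
print(should_send(1444300000, 1444300060, "1w"))
```

One week is 604800 seconds, which fits comfortably in a signed 32-bit integer, so the raw interval size alone would not overflow. Whether Xymon's own arithmetic stays within range on a given platform is not something this sketch can show, which is why trying a smaller interval is still a useful test.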

Is the test persistently red with no spurious recoveries being generated
during the period in question?


-jc