On Fri, 2005-06-17 at 08:01 +0200, Henrik Stoerner wrote:
> Something like
>
>   HOST=%(www.*).foo.com TEST=http COLOR=red COUNT>=5
>   MAIL user-3aaf2ac8399f@xymon.invalid
>
> The "COUNT>=5" would then cause this rule to trigger only if there
> were 5 or more hosts named www.*.foo.com whose http tests are red.
> You could even combine this with other criteria, say have a threshold of
> 5 during the daytime, and 10 during off-hours.
>
> I can foresee a problem in handling recovery notifications for this kind
> of alert, but that's something I'll have to think about.
>
> Would that be useful?

The main place I would use it would be NTP alerts. If one router loses
NTP, I'm not terribly worried. If 10-20 of them all fail at once then I
know there is something really bad happening... Maybe both GPS clocks
lost sync and all 4 cesium backups failed, or ntp locked up on a core
router and I need to make fewer down-stream nodes dependent on that one.
I would also consider using it for purple alerts. I don't want
individual purples for most of my stuff, but if there are a lot of them
(>100) then I know I killed mrtg and I should page on that.
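
For illustration only, here is a rough Python sketch of how the proposed count-based rule might be evaluated, including the daytime/off-hours thresholds mentioned above. It is not how the alert module works; the function names, the (host, test, color) tuple layout, and the 08:00-17:00 definition of "daytime" are all assumptions made up for this example.

```python
import re
from datetime import time


def count_matching(statuses, host_pattern, test, color):
    """Count statuses whose host matches the regex and whose test/color match.

    `statuses` is a hypothetical list of (hostname, test, color) tuples.
    """
    rx = re.compile(host_pattern)
    return sum(
        1
        for host, t, c in statuses
        if t == test and c == color and rx.fullmatch(host)
    )


def threshold_for(now, day_threshold=5, night_threshold=10,
                  day_start=time(8, 0), day_end=time(17, 0)):
    """Pick the threshold by time of day (daytime hours are an assumption)."""
    return day_threshold if day_start <= now < day_end else night_threshold


def should_alert(statuses, host_pattern, test, color, now):
    """True when enough matching hosts are in the given color at time `now`."""
    matched = count_matching(statuses, host_pattern, test, color)
    return matched >= threshold_for(now)


if __name__ == "__main__":
    # Six red www hosts, plus two entries that should not be counted.
    statuses = [(f"www{i}.foo.com", "http", "red") for i in range(1, 7)]
    statuses.append(("other.foo.com", "http", "red"))
    statuses.append(("www7.foo.com", "http", "green"))

    pattern = r"www.*\.foo\.com"
    print(should_alert(statuses, pattern, "http", "red", time(12, 0)))
    print(should_alert(statuses, pattern, "http", "red", time(23, 0)))
```

With six matching red hosts, the sketch alerts at noon (threshold 5) but stays quiet at 23:00 (threshold 10). The same shape would cover the NTP and purple cases by changing the test, color, and thresholds. Recovery handling is deliberately left out, since that part is still an open question.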
--
Daniel J McDonald, CCIE # 2495, CNX
Austin Energy
user-290ce4e24e19@xymon.invalid