On 6/7/2017 1:29 PM, Asif Iqbal wrote:
On Wed, Jun 7, 2017 at 4:26 PM, Ryan Novosielski <user-46c89e614701@xymon.invalid> wrote:
I stop the MTA sometimes when I know this is about to happen.
In our case, we do not see a pattern in when the firewall crashes.
From: Xymon <xymon-bounces at xymon.com> on behalf of Asif Iqbal <user-6f4b51ac2a40@xymon.invalid>
Sent: Wednesday, June 7, 2017 4:14:43 PM
To: xymon at xymon.com
Subject: [Xymon] Avoid email storm
Xymon is behind a firewall, and lately the firewall has been dying a lot.
When it is restored, we get an email storm about devices that are at
other sites.
The firewall team will have a replacement "soon".
In the meantime, I guess I should add a depends (first hop) for all
2000+ devices in the hosts.cfg file.
Is there another option with less editing?
Thanks for any suggestions!
Flap detection and delaying the alert might help, but the delayred= tag (see hosts.cfg(5)) might be a better option for handling this, depending on the number of network tests you have.
If the entire server is behind the FW, or you have some other programmatic way of determining what will be on the other side of this FW, I'd suggest adding that data (via either a route or a depends tag) regardless. You might be able to get creative with sed or perl to make the addition easier. I've found that the more flexibility you give yourself in describing what you're testing, the more options you'll have for coping with situations like this.
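As a rough illustration of the scripted edit suggested above, here is a minimal Python sketch (Python instead of sed or perl, purely as a matter of taste) that appends a route: tag to every host line in hosts.cfg that does not already have one. It rests on some assumptions: "fw1" is a hypothetical name for the firewall's own hosts.cfg entry, host lines are taken to be any line that starts with an IPv4 address followed by a hostname, and every host is assumed to sit behind the firewall. Files pulled in via include directives are not handled. Try it on a copy of the file first.

```python
import re

# Assumption: a host entry starts with an IPv4 address followed by a hostname.
HOST_LINE = re.compile(r"^\s*\d{1,3}(?:\.\d{1,3}){3}\s+\S+")


def add_route_tag(text, router):
    """Return hosts.cfg text with 'route:<router>' appended to each host
    line that has no route: tag yet. The router's own entry is skipped."""
    out = []
    for line in text.splitlines():
        if HOST_LINE.match(line) and "route:" not in line:
            hostname = line.split()[1]
            if hostname != router:
                if "#" in line:
                    # Tags already follow the '#', so just append one more.
                    line = line.rstrip() + " route:" + router
                else:
                    # No tag section yet, so start one.
                    line = line.rstrip() + " # route:" + router
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # "fw1" is a hypothetical hostname for the firewall entry.
    sample = (
        "page remote Remote site\n"
        "10.0.0.1 fw1 # conn\n"
        "10.1.0.5 web1 # http://web1/\n"
        "10.1.0.6 db1\n"
    )
    print(add_route_tag(sample, "fw1"), end="")
```

If only some hosts are behind the firewall, the same loop could filter on an address prefix or on the page/group the host belongs to before appending the tag.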
HTH,
-jc