Xymon Mailing List Archive

Xymon disruption every night!

list L.M.J
Fri, 29 Jan 2016 08:56:57 +0100
Message-Id: <user-2b9c93897617@xymon.invalid>

Hi,

I've been running Xymon for 6 years (4.3.17 at the moment) on Debian 7.8, kernel 3.2.0-4-amd64.
For the past month, every night between 0:30 and 2:00 am (+/- 30 min), around 30 hosts become unreachable:

Fri Jan 29 01:16:38 2016 conn NOT ok : DNS lookup failed
Unable to resolve hostname foo.bar.local
System unreachable for 3 poll periods (170 seconds)
green 0.0.0.0 is alive (0.02 ms) [<- 127.0.0.1]


- I have around 500 monitored hosts, and it looks like the same hosts are lost every single night.
- Those hosts are not necessarily on the same network, nor running the same OS.
- We cross-monitor the same hosts, and the other monitoring tool doesn't report any DNS outage.
- I ran a DNS lookup every second on the Hobbit server for several days and it never reported a DNS outage.
- I don't have any crontab installed on the server that could disturb Xymon.
- Nothing strange in the Xymon logs or the server logs, no memory leak or CPU overload.
- The rest of the day, the Xymon server behaves normally.
- What did I change on the server 1 month ago? Nothing that I know of: no system upgrade or the like.
- I had DNSMASQ acting as a cache; I disabled it: same issue.
- /etc/resolv.conf is quite minimal: "search bar.local", then on the next line "nameserver IP.OF.OUR.DNS.SERVER1", just like on the other servers.
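
For reference, the once-per-second lookup test mentioned above can be sketched roughly as follows. This is a minimal illustration, not the exact script that was run: the function names check_hosts and run, the log file name dns-probe.log, and the one-second interval are all assumptions made for the example. It goes through the system resolver (socket.getaddrinfo), which may not be the same code path the Xymon network tester uses, so a clean result here does not rule out a failure on the Xymon side.

```python
import socket
import time
from datetime import datetime


def check_hosts(hostnames, resolve=socket.getaddrinfo):
    """Resolve each hostname once.

    Returns a list of (hostname, error text, elapsed seconds) tuples,
    one per hostname that failed to resolve.
    """
    failures = []
    for name in hostnames:
        started = time.monotonic()
        try:
            resolve(name, None)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            elapsed = time.monotonic() - started
            failures.append((name, str(exc), elapsed))
    return failures


def run(hostnames, interval=1.0, logfile="dns-probe.log"):
    """Probe forever, appending one timestamped line per failed lookup.

    The log file name and interval are hypothetical defaults.
    """
    while True:
        for name, error, elapsed in check_hosts(hostnames):
            stamp = datetime.now().isoformat(timespec="seconds")
            with open(logfile, "a") as log:
                log.write(f"{stamp} {name} failed after {elapsed:.2f}s: {error}\n")
        time.sleep(interval)
```

Calling run(["foo.bar.local"]) with the affected hostnames and leaving it running overnight gives timestamps that can be compared with the "DNS lookup failed" entries in the Xymon status history. Logging the elapsed time as well helps to tell a slow resolver (timeouts) apart from an immediate negative answer.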

The issue could be anywhere: inside or outside the server, in Xymon or not... I have to confess I'm running out of ideas for tracking it down. If anyone here has any leads, I would be thankful!

Have a nice day!