Xymon forgetting acks at reboot
list Tom Diehl
Hi, I am running the 4.3.10 rpms from terabithia.org on a CentOS 6 machine. Every time I reboot the machine, xymon forgets all of the services that are either acked or disabled. Is this expected behavior, and if not, how do I go about troubleshooting it? Regards, Tom user-dcee455aaab0@xymon.invalid Spamtrap address user-4d123f9c385b@xymon.invalid
list Steve Holmes
Firstly, where is the ack directory? If it is in /tmp that would explain them disappearing. Steve Wherever you go, there you are.
▸
On Mar 16, 2013, at 12:05 AM, user-dcee455aaab0@xymon.invalid wrote:
Hi, I am running the 4.3.10 rpms from terabithia.org on a Centos 6 machine. Every time I reboot the machine xymon forgets all of the services that are either acked or disabled. Is this expected behavior and if not how do I go about troubleshooting it? Regards, Tom user-dcee455aaab0@xymon.invalid Spamtrap address user-4d123f9c385b@xymon.invalid
list Japheth Cleaver
(moved to bottom posting)
▸
Firstly, where is the ack directory? If it is in /tmp that would explain them disappearing. Steve
Actually, it might be simpler than that... :/

Long story short, there's a $XYMONRUNDIR patched in, pointing to /var/run/xymon/, to differentiate certain daemon-controlled files for SELinux purposes. Sockets and pid files are fine there, but checkpoint files shouldn't be, since they're expected to survive a reboot/crash. xymond state will be recreated relatively soon, but the alert checkpoint is gone and would be missed. Kind of surprised I hadn't caught this, or it hadn't been reported by now...

Quick fix: Double check the --checkpoint-file and --restart options for [xymond] and [alert] in /etc/xymon/tasks.cfg -- change them to point to $XYMONTMP (default: /var/lib/xymon/tmp/ in the RPM) instead of $XYMONRUNDIR.

This is almost certainly a packaging issue; I'll post a note on this on the site and get a new RPM out soon.

Thanks, -jc
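For anyone wanting to check their own install before editing, the quick fix above can be sketched as a small shell helper. This is only a sketch under assumptions: `find_rundir_checkpoints` is a made-up function name (not part of Xymon), and it assumes the options are written in the usual `--option=path` form inside tasks.cfg.

```shell
#!/bin/sh
# Sketch: list --checkpoint-file / --restart options that still point at
# $XYMONRUNDIR or /var/run in a tasks.cfg-style file.
# find_rundir_checkpoints is a hypothetical helper name, not a Xymon tool.
find_rundir_checkpoints() {
    cfg="${1:-/etc/xymon/tasks.cfg}"
    # Skip commented-out lines, print each matching option on its own line,
    # then keep only those that reference the run directory.
    grep -v '^[[:space:]]*#' "$cfg" \
        | grep -oE -- '--(checkpoint-file|restart)=[^[:space:]]+' \
        | grep -E 'XYMONRUNDIR|/var/run/'
}
```

Running `find_rundir_checkpoints /etc/xymon/tasks.cfg` prints the options that would need to be changed to $XYMONTMP; no output (and a non-zero exit status) means nothing points at the run directory.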
list Tom Diehl
▸
On Sat, 16 Mar 2013, user-87556346d4af@xymon.invalid wrote:
Quick fix: Double check the --checkpoint-file and --restart options for [xymond] and [alert] in /etc/xymon/tasks.cfg -- change them to point to $XYMONTMP (default: /var/lib/xymon/tmp/ in the RPM) instead of $XYMONRUNDIR.
FWIW, I noticed that there was a symlink in /var/lib/xymon/tmp that looks like the following:

(bugs pts8) # ll /var/lib/xymon/tmp/xymond.chk
lrwxrwxrwx. 1 root root 25 Sep 5 2012 /var/lib/xymon/tmp/xymond.chk -> /var/run/xymon/xymond.chk
(bugs pts8) #

Restarting xymon after modifying /etc/xymon/tasks.cfg nuked the symlink and wrote the checkpoint file to /var/lib/xymon/tmp. In addition, I found that --checkpoint-file under alert was already set to $XYMONTMP/alert.chk.
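The symlink check described here could be repeated after the restart with something like the following. It is a rough sketch only: `checkpoint_location` is a hypothetical helper name, and the path passed to it is whatever checkpoint file your tasks.cfg names.

```shell
#!/bin/sh
# Sketch: report whether a checkpoint file is a symlink (and to where),
# a regular file, or missing.
# checkpoint_location is a hypothetical helper name, not a Xymon tool.
checkpoint_location() {
    f="$1"
    if [ -L "$f" ]; then
        # readlink prints the link target as stored in the symlink
        echo "symlink -> $(readlink "$f")"
    elif [ -f "$f" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}
```

For example, `checkpoint_location /var/lib/xymon/tmp/xymond.chk` should report a regular file once the checkpoint is being written to $XYMONTMP directly.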
▸
This is almost certainly a packaging issue; I'll post a note on this on the site and get a new RPM out soon.
I am going to reboot the machine later today. Will let you know if this solves the issue. Thanks for looking into this. I really appreciate the work you do on the xymon rpms. Regards, -- Tom user-dcee455aaab0@xymon.invalid Spamtrap address user-4d123f9c385b@xymon.invalid