Conn test fails after server reboot
list John Horne
Hello,
Using Xymon 4.3.7 I have noticed that if I reboot the Xymon server then the 'conn' test fails for all the clients. E.g.:
============================
Thu Jul 12 10:24:11 2012 conn NOT ok
Service conn on dns1 is not OK : Host does not respond to ping
System unreachable for 5 poll periods (984 seconds)
============================
If, from the server, I run 'ping' to the client then that works fine. So does fping. If I stop then start the Xymon service on the server then the client conn tests all report ok.
Any ideas about this?
John.
--
John Horne Tel: +XX (X)XXXX XXXXXX
Plymouth University, UK Fax: +XX (X)XXXX XXXXXX
list Jeremy Laidman
How long did you wait between the reboot and restarting Xymon?
list John Horne
Hello,
I have waited various amounts of time, from as soon as I could log in (about a minute or two since rebooting), up to about an hour. I should have added that after a reboot, and when the conn tests are red, then they stay red! Yet the clients are all up and running, and are pingable. At what time I restart Xymon seems to make no difference, once it is done then the tests start to turn green.
I can only assume that there is some initial condition which causes the ping to fail, but that it remains in force until Xymon is restarted. Very odd. I will investigate, but am a little lost as to why, say after 5, 10, 60 (!) mins, the tests do not automatically turn green.
I added 'trace' to one client in hosts.cfg, and it shows the traceroute working fine but the test is still red and saying the ping failed.
John.
list Steven Carr
What's the ping command set to in your server configuration file? Are you using the 'xymonping' command or 'fping'? Make sure that whichever command you are using has the setuid bit (often loosely called the sticky bit) set on the actual executable to allow the xymon user to run it.
Steve
list John Horne
It is set to use fping. The pathname is correct, and the sticky bit is set. I have run fping from the Xymon server as the xymon user and it works fine:
===============================
xymon 17: fping -Ae 141.163.1.250 141.163.177.1
141.163.1.250 is alive (0.43 ms)
141.163.177.1 is alive (0.35 ms)
===============================
John.
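A manual check like the one above can also be scripted. The following is only a minimal sketch, assuming fping output lines of the form shown in the message ("<address> is alive (<n> ms)"); the function name `parse_fping_alive` is hypothetical and is not part of Xymon or fping.

```python
import re

# Matches lines such as "141.163.1.250 is alive (0.43 ms)".
ALIVE_RE = re.compile(r"^(\S+) is alive \(([\d.]+) ms\)$")


def parse_fping_alive(output):
    """Return {address: round-trip time in ms} for every 'is alive' line."""
    alive = {}
    for line in output.splitlines():
        match = ALIVE_RE.match(line.strip())
        if match:
            alive[match.group(1)] = float(match.group(2))
    return alive


# Sample text copied from the fping run quoted in the thread.
sample = """141.163.1.250 is alive (0.43 ms)
141.163.177.1 is alive (0.35 ms)"""
print(parse_fping_alive(sample))
```

Lines that do not match the assumed format are simply ignored, so the sketch reports only the hosts that fping explicitly declared alive.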
list Xymon User in Richmond
Just a WAG: could Xymon be getting started before the network interfaces and be locked onto localhost as a route, or in some other ambiguous networking state? How's it getting started at boot?
list John Horne
Hello,
Sorry, but this turned out to be an SELinux problem. 'fping' is denied
write access to files in the ~/server/tmp directory on the Xymon server.
However, fping records its results in that directory, and Xymon looks at
them to see if a client is alive or not. Since there were no results,
because of SELinux, Xymon figured that all the clients were down.
I have created a local SELinux policy to allow writes for fping and that
seems to work. (I have rebooted the Xymon server and it didn't show any
red ping/conn tests.)
The clients don't use 'fping' so they don't have this problem.
Why did restarting the Xymon service (not the server) allow the tests to
turn green? Not sure.
Thanks for all the replies.
John.
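The failure mode described in this message (no fping results, so every client looks down) can be illustrated with a small sketch. This is not Xymon's actual code; `classify_hosts` is a hypothetical helper, and it assumes the result text uses the "<address> is alive" format seen earlier in the thread.

```python
def classify_hosts(expected_hosts, fping_results):
    """Split hosts into (up, down) based on the text fping left behind.

    Any host without an 'is alive' line counts as down, so an empty
    results file marks every host as down.
    """
    alive = set()
    for line in fping_results.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[1:3] == ["is", "alive"]:
            alive.add(parts[0])
    up = sorted(h for h in expected_hosts if h in alive)
    down = sorted(h for h in expected_hosts if h not in alive)
    return up, down


hosts = ["141.163.1.250", "141.163.177.1"]
full = "141.163.1.250 is alive (0.43 ms)\n141.163.177.1 is alive (0.35 ms)\n"

print(classify_hosts(hosts, full))  # results were written normally
print(classify_hosts(hosts, ""))    # results file empty, e.g. the write was denied
```

With an empty results string, both hosts land in the "down" list even though they would answer a ping, which matches the all-red conn column reported after the reboot.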
list Jeremy Laidman
On Fri, Jul 13, 2012 at 6:38 PM, John Horne <user-e95f1ec2f147@xymon.invalid> wrote:
I should have added that after a reboot, and when the conn tests are red, then they stay red! Yet the clients are all up and running, and are pingable. At what time I restart Xymon seems to make no difference, once it is done then the tests start to turn green.
This symptom is probably significant, but I can't think what might cause it. Once we know, it will all make sense! Does tcpdump/snoop show the ping packets before the restart of Xymon?
J
list Japheth Cleaver
SELinux policies distinguish between appending, writing, and seeking in many cases. I don't recall the details, but I remember needing to futz with different policies to figure out what was going on as well. Was anything interesting going on in the audit logs at the time?
-jc
list John Horne
Hi,
Nothing else was going on in the logs at the time that the fpings were
stopped. The log showed that it was a write denial:
=============================
type=AVC msg=audit(1342195229.681:349): avc: denied { write } for
pid=25973 comm="fping"
path="/home/xymon/server/tmp/ping-stderr.25955.00" dev=sdb1 ino=1587865
scontext=system_u:system_r:ping_t:s0
tcontext=system_u:object_r:user_home_t:s0 tclass=file
=============================
Using audit2allow to create a policy allowing writes in 'tmp' solved the
problem.
John.
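To make the denial above easier to read, the sketch below pulls the relevant fields out of the AVC record and prints the kind of type-enforcement rule that a tool such as audit2allow derives from it. This is an illustration only: `avc_to_allow_rule` is a hypothetical helper, the real audit2allow output is a full policy module and may differ, and the policy actually installed is not shown in the thread.

```python
import re


def avc_to_allow_rule(record):
    """Build an 'allow' rule string from one AVC denial record."""
    perms = re.search(r"denied\s+\{\s*([^}]+?)\s*\}", record).group(1).split()
    # The type is the third colon-separated field of an SELinux context.
    source = re.search(r"scontext=(\S+)", record).group(1).split(":")[2]
    target = re.search(r"tcontext=(\S+)", record).group(1).split(":")[2]
    tclass = re.search(r"tclass=(\S+)", record).group(1)
    perm_text = perms[0] if len(perms) == 1 else "{ " + " ".join(perms) + " }"
    return f"allow {source} {target}:{tclass} {perm_text};"


# The record from the message above, joined back into a single line.
record = (
    'type=AVC msg=audit(1342195229.681:349): avc: denied { write } for '
    'pid=25973 comm="fping" '
    'path="/home/xymon/server/tmp/ping-stderr.25955.00" dev=sdb1 ino=1587865 '
    'scontext=system_u:system_r:ping_t:s0 '
    'tcontext=system_u:object_r:user_home_t:s0 tclass=file'
)
print(avc_to_allow_rule(record))
```

Read this way, the record says that a process in the `ping_t` domain (fping) was refused `write` on a file labelled `user_home_t`, which is the label carried by the file under /home/xymon/server/tmp.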