This is a somewhat old post, but I'm responding anyway ...
In <user-bac20ebe1220@xymon.invalid> Steve Holmes <user-ec1bf77b1b44@xymon.invalid> writes:
Please see below, there is a problem with disk monitoring on one of the
servers. Can someone tell me if I did something wrong?
W]d Jul 28 10:34:31 EDT 2010 - Filesystems NOT ok
7% / (8816628% used) has reached the PANIC level (95%)
38% /u01 (90371708% used) has reached the PANIC level (95%)
Filesystem 10
4-b]ocks Used Available Capacity Mounted on
/dev/sda9 9920592 591896 8816628 7% /
/dev/sda10 152435112 54195172 90371708 38% /u01
/dev/sda8 9920592 154056 9254468 2% /tmp
It appears that Xymon has slipped one field to the left in parsing the df
output. The string at the beginning of each of the lines before the actual
df output should be the name of the filesystem (plus an icon, but we'll
ignore that for now). Then it is using the available number as the percent
used, which, of course, is huge.
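To illustrate the off-by-one reading described above, here is a minimal Python sketch. It is not Xymon's actual parsing code; the function name parse_df_line and the capacity_index parameter are made up for this example, and it assumes the mount point is simply everything after the capacity column. Reading one column too far to the left reproduces both the bogus name and the bogus percentage from the status message.

```python
PANIC_LEVEL = 95  # percent, taken from the status message above


def parse_df_line(df_line, capacity_index=4):
    """Split one df data line into (name, percent_used).

    capacity_index is the position of the Capacity column; everything
    after it is treated as the filesystem name (hypothetical behaviour).
    """
    fields = df_line.split()
    percent_used = int(fields[capacity_index].rstrip("%"))
    name = " ".join(fields[capacity_index + 1:])
    return name, percent_used


line = "/dev/sda9 9920592 591896 8816628 7% /"

# Correct column: the name is "/" and usage is 7%.
good_name, good_pct = parse_df_line(line)

# One field to the left: the Available column is taken as the percentage,
# and the real percentage becomes part of the name.
bad_name, bad_pct = parse_df_line(line, capacity_index=3)

print(f"{good_name} ({good_pct}% used) panic={good_pct >= PANIC_LEVEL}")
print(f"{bad_name} ({bad_pct}% used) panic={bad_pct >= PANIC_LEVEL}")
```

The second print gives "7% / (8816628% used)", which matches the first alert line in the report.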
I don't know if this is causing the problem but there is some funkiness with
the first line of the df output. It is broken between the 10 and the 4 and
there is a ']' instead of an 'l' in the word "blocks". Maybe this is a
cut/paste error, but if not, it is certainly not right.
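If the header really is damaged in the data and not just in the paste, a quick sanity check along these lines would flag it. This is only a sketch under assumptions: the intact header is presumed to be the usual "df -P" one, and the function name header_looks_intact is invented for this example; nothing here is taken from the Xymon source.

```python
# Column titles expected in a POSIX-style "df -P" header line.
EXPECTED_COLUMNS = ("Filesystem", "Used", "Available", "Capacity", "Mounted on")


def header_looks_intact(header_line):
    """Return True if every expected column title appears in the line."""
    return all(title in header_line for title in EXPECTED_COLUMNS)


# What the header presumably should have looked like.
intact = "Filesystem 1024-blocks Used Available Capacity Mounted on"

# The first line of the header as it appears in the report above.
damaged = "Filesystem 10"

print(header_looks_intact(intact))
print(header_looks_intact(damaged))
```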
There is a bug somewhere in the Xymon 4.3.0-beta code with the "df"
status handling. I've seen it cause random RRD files to appear for
systems that don't have such filesystems, and occasionally it would
also result in this behaviour where a disk status goes wild.
I haven't been able to nail it yet, mostly because it seems to happen
very rarely and completely without any pattern. It would seem like
some sort of memory corruption problem, but I've had the client-message
handler running for days with valgrind (memory access checker) enabled,
and it came up with nothing.
Very annoying.
Regards,
Henrik