Xymon Mailing List Archive

False red alerts for disk

Patrik Nilsson
Thu, 11 Jun 2009 11:40:40 +0200
Message-Id: <user-0bc02f77e1b8@xymon.invalid>

I am now also seeing this with memory reports. There seems to be a
general but intermittent error in the parsing of client data.

T 2009][uname]
Linux tc1.jalbum.net 2.6.18-92.1
22.el5xen ]86_64 - Memory CRITICAL
   Memory              Used       Total  Percentage
red Physical          48576M          1M    4857600%
red Actual              819M          1M      81900%
green Swap                 80M       1983M          4%

Notice the messed up brackets.

The corresponding part of the actual client data reported is:

client tc1,hostnamechanged,net.linux linux
[date]
Thu Jun 11 11:31:36 CEST 2009
[uname]
Linux tc1.hostnamechanged.net 2.6.18-92.1.22.el5xen x86_64
[osversion]
CentOS release 5.2 (Final)
[uptime]
 11:31:36 up 26 days, 22:25,  1 user,  load average: 0.12, 0.10, 0.03
[who]
root     xvc0         May 15 13:09
[df]
Filesystem         1024-blocks      Used Available Capacity Mounted on
/dev/mapper/VolGroup00-LogVol00  10102072   5553636   4131628      58% /
/dev/xvda1              101086     20724     75143      22% /boot
[mount]
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/xvda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
192.168.8.8:/mnt/share on /share type nfs (rw,addr=192.168.8.8)
[free]
             total       used       free     shared    buffers     cached
Mem:       1048576    1043172       5404          0       1936     201892
-/+ buffers/cache:     839344     209232
Swap:      2031608      82368    1949240
[ifconfig]
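
As a cross-check of the figures above: the Swap line and the "Actual" used value on the status page match the [free] data once the numbers are divided by 1024, while both red lines carry a total of 1M instead of 1024M. The sketch below recomputes the table under the assumption that [free] reports KB and that the page shows whole MB with integer percentages. The helper names are hypothetical and this is not taken from the Xymon source.

```python
# Cross-check of the memory table, assuming [free] is in KB and the status
# page shows whole MB with integer percentages (assumptions, not Xymon code).

def to_mb(kb):
    """Convert KB to whole MB, rounding down."""
    return kb // 1024

def percent(used_mb, total_mb):
    """Integer percentage of used versus total."""
    return used_mb * 100 // total_mb

# Values copied from the [free] section of the client message.
mem_total, mem_used = 1048576, 1043172
actual_used = 839344            # "used" on the "-/+ buffers/cache" line
swap_total, swap_used = 2031608, 82368

expected = {
    "Physical": (to_mb(mem_used), to_mb(mem_total)),
    "Actual": (to_mb(actual_used), to_mb(mem_total)),
    "Swap": (to_mb(swap_used), to_mb(swap_total)),
}
for name, (used, total) in expected.items():
    print(f"{name:8s} {used:6d}M {total:6d}M {percent(used, total):4d}%")

# The red lines on the status page are what the same formula yields
# when the total is taken to be 1M.
reported = {"Physical": (48576, 1), "Actual": (819, 1)}
for name, (used, total) in reported.items():
    print(f"{name:8s} {used:6d}M {total:6d}M {percent(used, total)}%")
```

Under these assumptions the percentages on the red lines are arithmetically consistent with the displayed used and total values, so the damage appears to happen before the percentage is computed.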

Patrik

On Wed, Jun 10, 2009 at 4:57 PM, Patrik Nilsson<user-f78fa12d6274@xymon.invalid> wrote:
Hi,

Running Xymon 4.3.0-0.beta2, I sometimes get false red alerts from
disk on a few servers (one of the servers is the Xymon server itself).

Usually the disk status is reported green, like this:

Wed Jun 10 16:29:17 CEST 2009 - Filesystems OK

Filesystem         1024-blocks      Used Available Capacity Mounted on
/dev/sda1            204603376   1616748 192593380       1% /

But occasionally, I get red alerts, like this:

- Filesystems NOT ok

red 192593256       1% / (1616872% used) has reached the PANIC level (95%)

Filesystem         1024-blocks
Use] Available Capacity Mounted on
/dev/sda1            204603376   1616872 192593256       1% /

Somehow the parsing of the client data doesn't work right, resulting
in the disk blocks being interpreted as percent used.

The corresponding df part in the actual client report looks like this:

 [df]
Filesystem         1024-blocks      Used Available Capacity Mounted on
/dev/sda1            204603376   1616872 192593256       1% /
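
One hypothesis that fits the red alert above is that the column positions are taken from the header line, and that the header was split in two on the server side. The following is a minimal sketch under that assumption; `column_indexes` and `parse_row` are hypothetical helpers and have not been checked against the Xymon source.

```python
# Hypothetical df parser: the column positions come from the header line.
# This is an assumption made for illustration, not Xymon's actual code.

def column_indexes(header):
    """Return the token positions of the Capacity and Mounted columns."""
    tokens = header.split()
    return tokens.index("Capacity"), tokens.index("Mounted")

def parse_row(row, cap_idx, mnt_idx):
    """Return (mount point, percent used) using the given column positions."""
    tokens = row.split()
    used_pct = int(tokens[cap_idx].rstrip("%"))
    mount = " ".join(tokens[mnt_idx:])
    return mount, used_pct

row = "/dev/sda1            204603376   1616872 192593256       1% /"

good_header = "Filesystem         1024-blocks      Used Available Capacity Mounted on"
# Second half of the header as it appeared on the red status page.
broken_header = "Use] Available Capacity Mounted on"

print(parse_row(row, *column_indexes(good_header)))
print(parse_row(row, *column_indexes(broken_header)))
```

With the intact header the row parses as mount point `/` at 1%. With the truncated header the same row yields the mount point `192593256 1% /` and 1616872 percent used, which is what the alert text shows.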


On another server, the false red alert looks like this:
Wed Jun 10 15:51:53 CEST 2009 - Filesystems NOT ok

red 44% / (2778580% used) has reached the PANIC level (95%)
red 6% /home (2167204% used) has reached the PANIC level (95%)

Filesystem         1
24-]locks      Used Available Capacity Mounted on
/dev/xvda2             5162828   2121988   2778580      44% /
/dev/xvda3             24
7244  ] 136744   2167204       6% /home

While it usually looks like this:
 Wed Jun 10 15:56:54 CEST 2009 - Filesystems OK

Filesystem         1024-blocks      Used Available Capacity Mounted on
/dev/xvda2             5162828   2122012   2778556      44% /
/dev/xvda3             2427244    136784   2167164       6% /home


Slightly different, but once again a block count (this time from the
Available column) is being interpreted as percentage used.
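
To find out how often this happens, the stored status text could be scanned for lines that are neither the expected header nor a well-formed row. The checker below is a rough sketch, assuming the six-column Linux df layout shown in this message; the regular expression and the `damaged_lines` helper are hypothetical and not part of Xymon.

```python
import re

# Assumed shape of a healthy row: device, three block counts, a percentage
# and a mount point. Hypothetical checker, not part of Xymon.
ROW = re.compile(r"^\S+\s+\d+\s+\d+\s+\d+\s+\d+%\s+/\S*$")
HEADER = "Filesystem 1024-blocks Used Available Capacity Mounted on"

def damaged_lines(listing):
    """Return the lines that are neither the expected header nor a valid row."""
    bad = []
    for line in listing.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if " ".join(stripped.split()) == HEADER or ROW.match(stripped):
            continue
        bad.append(line)
    return bad

# Listing copied from the red status page of the second server.
red_page = """Filesystem         1
24-]locks      Used Available Capacity Mounted on
/dev/xvda2             5162828   2121988   2778580      44% /
/dev/xvda3             24
7244  ] 136744   2167204       6% /home
"""

# Listing copied from the green status page of the same server.
green_page = """Filesystem         1024-blocks      Used Available Capacity Mounted on
/dev/xvda2             5162828   2122012   2778556      44% /
/dev/xvda3             2427244    136784   2167164       6% /home
"""

for line in damaged_lines(red_page):
    print("damaged:", line)
print("green page damaged lines:", len(damaged_lines(green_page)))
```

On the red page this flags both halves of the split header and both halves of the split /dev/xvda3 row, while the green page passes cleanly.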

Does anyone have an idea of what might be causing this?

Thanks,

Patrik Nilsson