Re: Axel Beckert 2016-06-15 <user-0f042641758d@xymon.invalid>
In the past few months I found more and more indications of a strange bug
in (at least) Xymon 4.3.27 which occasionally mixes up hosts when
handling reports:
* Machines with a single disk (e.g. VMs) occasionally report the status of
a "raid" test which is not deployed to them -- and then (for obvious
reasons) go purple on it. On that server, there's only one machine
with a RAID, but its "raid" reports have been misassigned to at
least three other hosts, all of which have rather many tests
(compared to a bunch of sensors which send in only very few tests
per host).
[...]
Fwiw, I've seen instances of such behavior ever since I started
taking care of a Hobbit installation at a customer site in late 2007.
The symptoms are randomly mixed-up hosts. I can't say whether some tests
are hit more than others; the problem is mostly visible through
disk tests, by finding rrd files on disk for partitions that do not
exist on the host in question.
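For anyone who wants to check their own installation, here is a rough sketch of such a scan. It is not part of Xymon. It assumes one subdirectory per host under the rrd directory and disk files named like "disk,root.rrd" or "disk,var.rrd"; please verify both against your setup. The function name and the `expected` map (host name to the set of disk rrd file names that host should have) are made up for illustration and have to be filled in by hand.

```python
from pathlib import Path


def find_unexpected_disk_rrds(rrd_root, expected):
    """Return {host: [file names]} for disk rrd files that are not expected.

    rrd_root -- directory with one subdirectory per host (assumed layout)
    expected -- hypothetical map: host name -> set of expected disk rrd names
    """
    unexpected = {}
    for host_dir in sorted(Path(rrd_root).iterdir()):
        if not host_dir.is_dir():
            continue
        known = expected.get(host_dir.name)
        if known is None:
            # No expectation recorded for this host, so nothing to compare.
            continue
        # Assumed naming scheme: disk,<partition>.rrd
        stray = sorted(
            p.name for p in host_dir.glob("disk,*.rrd") if p.name not in known
        )
        if stray:
            unexpected[host_dir.name] = stray
    return unexpected
```

Comparing the modification times of the stray files might also help to tell whether the mix-ups really come in bursts.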
It doesn't seem to happen constantly but rather in bursts, though I
don't have hard data on that. My impression was that it only happens
during busy periods, but that could be totally wrong.
We were on 4.3.0 for a long time until finally upgrading about two
years ago, and I thought the problem was gone then, but what Axel is
describing is exactly what we were (are?) seeing there.
Christoph