False alerts in hobbit

Anna Jonna Armannsdottir
Mon, 10 Nov 2008 14:51:47 +0000
Message-Id: <user-8ca10ea43579@xymon.invalid>

On mán, 2008-11-10 at 18:11 +1100, Adam Goryachev wrote:

I have been battling false alerts with hobbit for quite some time
(months or more), and am really starting to get quite frustrated,
mostly because I now tend to ignore my SMS messages since so many of
them are false positives.

Anyway, the fault is that the hobbit client reports get truncated, yet
the hobbit server uses the portion that it gets. This usually results in
the procs, ports, or both columns going red due to non-running
procs/non-open ports. In reality, the proc/port is fine, just the data
was truncated so the hobbit server couldn't find it.

Initially I discovered my hobbit server was truncating some of this
data, so I increased the relevant variables:
MAXLINE="65535"
MAXMSG_STATUS="2048"
MAXMSG_CLIENT="2048"
MAXMSG_DATA="4096"
# Anna added 2008-09-08 because of lots of truncations. 
MAXLINE="32768"
MAXMSG_STATUS="1024"
MAXMSG_DATA="1024"
MAXMSG_CLIENT="2048"
MAXMSG_NOTES="1024"

After I added this, the problem was solved. I found the 
sizes from the truncations reported in the logs. 
However, I still get many red alerts, and when I check, the log files do
not report any truncated or oversized messages. Also, when I examine the
"Client data available" from the red hobbit report, I find the size of
the message is nowhere near any value above, and in fact is always
different... Some reports that work are longer than reports that don't
work etc...
Do your logs really not report truncated or oversized messages, like
the ones in the following message?
http://www.hswn.dk/hobbiton/2006/05/msg00176.html
It isn't 100%, but more than 98% of the clients with the problem are on
bandwidth-limited networks.

I would appreciate any tips on how to make things more reliable.

Options I have considered:
1) Get hobbit to compress its data, which reduces network load, and
hence should improve reliability.
2) Add an "END" tag to the hobbit client data, and if the server doesn't
get the END tag then ignore the whole file (or re-request it)
3) Switch to polling mode (which effectively does 1 && 2 I suppose)
4) Try and track down what is causing this, and fix it...
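
Options 1 and 2 could be sketched roughly as follows in Python. This is
only an illustration under stated assumptions: the "[END]" marker and the
function names pack_report and unpack_report are hypothetical and are not
taken from Hobbit. The idea is that the receiver discards any report that
fails to decompress or lacks the end marker, rather than using the part
that arrived.

```python
import zlib

# Hypothetical end-of-report marker; not part of any real Hobbit protocol.
END_TAG = b"\n[END]\n"


def pack_report(report: bytes) -> bytes:
    """Append the end marker, then compress the whole message (options 2 + 1)."""
    return zlib.compress(report + END_TAG)


def unpack_report(wire: bytes):
    """Return the original report, or None if the message is incomplete."""
    try:
        data = zlib.decompress(wire)
    except zlib.error:
        # A truncated or corrupted zlib stream cannot be decompressed.
        return None
    if not data.endswith(END_TAG):
        # Decompressed fine, but the sender never wrote the end marker.
        return None
    return data[: -len(END_TAG)]


if __name__ == "__main__":
    report = b"[procs]\nsshd 1\n[ports]\ntcp 22 LISTEN\n"
    wire = pack_report(report)
    print(unpack_report(wire) == report)         # complete message is accepted
    print(unpack_report(wire[: len(wire) // 2]))  # truncated message is rejected
```

With a check like this, a truncated report would be dropped instead of
turning the procs or ports column red.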

My hobbit server is behind a NAT router, so one possibility I have
considered is that the NAT router is dropping the NAT mapping before the
end of the TCP connection, due to too many other connections or similar.
Have you considered setting up a Hobbit proxy? See:
http://www.hswn.dk/hobbiton/2007/06/msg00080.html

-- 
Kindest Regards, Anna Jonna Ármannsdóttir,       %&   A: Because people read from top to bottom.
Unix System Administration, Computing Services,  %&   Q: Why is top posting bad?
University of Iceland.