Xymon Mailing List Archive

weird problem.

David Baldwin
Wed, 27 Feb 2013 10:47:14 +1100
Message-Id: <user-2c95f3022a32@xymon.invalid>

On 26/02/13 8:26 PM, Adam Goryachev wrote:
On 26/02/13 19:47, Neil Simmonds wrote:
Hi all,

 
I’ve got a strange problem that I’m trying to diagnose and would
appreciate any help you can give.

 
We have 2 new AIX servers that have recently been set up and are
running the hobbit client. We have 62 other AIX servers with
the same client running absolutely fine.

 
The problem is that the client data is getting cut off mid-stream.
It’s always in the ps output. I’ve checked the MAX settings and
they’re all OK; in fact, we have other clients sending data files
larger than these that are working fine. I’ve checked the data on the
client and it’s complete, but if I look in /xymon/data/hostdata on the
server the data is almost always truncated to 69518 bytes.
Occasionally a full message (approx 93k) gets through.

 
There are no messages regarding truncated data in the server logs and
the only message I can find on the client is the following,

 
2013-02-26 08:41:21 Write error while sending message to bbd at xymonserver:1984

2013-02-26 08:41:21 Whoops ! bb failed to send message - write error

 
I’ve googled this extensively and can’t find anything that seems
relevant to our problem.

I get this from time to time, primarily when the xymon host has very
limited bandwidth. It seems to me that Xymon will accept whatever data
has been received prior to the connection being broken/interrupted,
and pretend it is complete (as opposed to discarding it).
The problem is that there isn't a well-defined "end of message" on a
standard client report. The message starts with a "client HOSTNAME.OS
CLASS" line, then consists of a bunch of sections, each starting with a
"[section]" line followed by lines of text. When the client has
finished sending its message it just does a shutdown on the write socket
and reads any returned data until EOF. That's it. The server probably
doesn't care if the client even reads the data it sends back, and has no
way of communicating with it anyway.
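
To make the sequence above concrete, here is a rough Python sketch of it.
This is not the Xymon client code; the function names, the "ok" reply and
the throwaway localhost listener are all made up for illustration. It only
shows the shape of the exchange: header line, [section] blocks, half-close,
then read until EOF.

```python
import socket
import threading


def build_client_message(hostname, os_name, host_class, sections):
    """Assemble a client report: a header line, then [section] blocks."""
    lines = [f"client {hostname}.{os_name} {host_class}"]
    for name, body in sections.items():
        lines.append(f"[{name}]")
        lines.append(body.rstrip("\n"))
    return ("\n".join(lines) + "\n").encode("utf-8")


def send_report(address, payload, timeout=10.0):
    """Send payload, half-close the socket, then read the reply until EOF."""
    with socket.create_connection(address, timeout=timeout) as sock:
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)  # the only "end of message" signal
        reply = bytearray()
        while chunk := sock.recv(4096):
            reply.extend(chunk)
    return bytes(reply)


def run_demo():
    """Exchange one report with a throwaway listener on localhost."""
    received = bytearray()
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port

    def serve_once():
        conn, _ = listener.accept()
        with conn:
            while chunk := conn.recv(4096):
                received.extend(chunk)
            conn.sendall(b"ok\n")  # hypothetical reply, not Xymon's

    worker = threading.Thread(target=serve_once)
    worker.start()
    payload = build_client_message(
        "myhost", "aix", "aix",
        {"uptime": "up 3 days", "ps": "PID CMD\n1 init"},
    )
    reply = send_report(listener.getsockname(), payload)
    worker.join()
    listener.close()
    return payload, bytes(received), reply
```

Note that nothing in the payload itself says "this is the last byte"; the
listener only knows the message is over because recv() returns nothing.
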

So if the client connection to the server is interrupted mid-stream, the
server quite probably just handles it as a socket shutdown and accepts
whatever has been received so far as the whole message.
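
A small Python sketch of why the reader cannot tell the difference. It
assumes the interruption reaches the reader as an ordinary end-of-stream
(a reset or a timeout may look different), and it uses a local socket pair
rather than a real Xymon server. The helper names and the sample payload
are invented for the example.

```python
import socket


def read_until_eof(sock):
    """Collect bytes until the peer's side of the stream ends."""
    data = bytearray()
    while chunk := sock.recv(4096):
        data.extend(chunk)
    return bytes(data)


def receive(payload, cut_at=None):
    """Send payload (or only its first cut_at bytes) and return what arrives."""
    writer, reader = socket.socketpair()
    with writer, reader:
        writer.sendall(payload if cut_at is None else payload[:cut_at])
        writer.close()  # the reader sees EOF in both cases
        return read_until_eof(reader)


def last_section(message):
    """Name of the last complete [section] header line in a client message."""
    name = None
    for line in message.decode("utf-8", errors="replace").splitlines():
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
    return name
```

Both a full send and a cut-off send come back from read_until_eof() without
any error. Running last_section() over a stored copy of the message is one
way to see which section the data stops in, which may help when comparing
what the client sent against what the server kept.
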
If this is happening frequently/all the time, I would suspect firewall
settings, and/or MTU issues (if it is packet size related). Check that
you are not blocking all ICMP, or that path MTU discovery is working
properly, check any firewall is not timing out or blocking the
connection for some reason, and that there is enough bandwidth for the
messages.

Potentially, a tcpdump at both client and server could be educational;
you could load the captures into Wireshark for analysis.

PS, I wonder when we will get compression, and/or encryption for the
status messages? Both would assist in making sure the complete message
arrives un-altered...
Indeed. There are other ways of delivering/fetching messages - maybe
worth exploring for more reliable transmission.

David.
Regards,
Adam

-- 
Adam Goryachev
Website Managers
www.websitemanagers.com.au

-- 
David Baldwin - Senior Systems Administrator (Datacentres + Networks)
Information and Communication Technology Services
Australian Sports Commission          http://ausport.gov.au
Tel 02 62147830 Fax 02 62141830       PO Box 176 Belconnen ACT 2616
user-cbbf693f2c89@xymon.invalid          Leverrier Street Bruce ACT 2617

