Xymon Mailing List Archive

Debian 9 stretch clients - trends netstat graph bug?

list Japheth Cleaver
Mon, 18 Mar 2019 10:30:22 -0700
Message-Id: <user-f4a9bb6f1159@xymon.invalid>

This is not just a Debian thing -- it looks like it results from output 
changes in recent versions of netstat/net-tools (recent Fedora is 
affected too).

I think the easiest fix will be to move to PCRE parsing to catch the 
differing versions, which is how we're handling the BSDs and some of the 
esoteric Unices. (Although there's a legacy RHEL3 Linux OS type from 
back in the day, this is one of the things we'll want to catch centrally 
moving forward.)
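
As a rough illustration of pattern-based matching (not the Xymon
implementation), here is a minimal sketch that uses POSIX regex.h as a
stand-in for PCRE. The helper name netstat_value, the RE_* patterns and the
tcp_sample excerpt are hypothetical names chosen for this example; the
counter lines are taken from the netstat output quoted later in this thread.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Illustrative patterns (assumptions, not the ones Xymon uses). Each one
 * accepts both the spelling in the marker table and the spelling that the
 * Debian 9 client reports. */
#define RE_ACTIVE_OPEN "^[ \t]*([0-9]+) active connections? openings"
#define RE_SEG_OUT     "^[ \t]*([0-9]+) segments sen[dt] out"
#define RE_SEG_RETRANS "^[ \t]*([0-9]+) segments retransmit+ed"

/* Excerpt of the "Tcp:" section quoted later in this thread. */
static const char tcp_sample[] =
    "Tcp:\n"
    "    3045575 active connection openings\n"
    "    1359715245 segments received\n"
    "    1330630207 segments sent out\n"
    "    119457 segments retransmitted\n";

/* Hypothetical helper: return the counter captured by the first group of
 * `pattern` on the first matching line of `text`, or -1 if no line matches
 * or the pattern does not compile. */
static long long netstat_value(const char *text, const char *pattern)
{
    regex_t re;
    regmatch_t m[2];
    long long value = -1;

    assert(text != NULL && pattern != NULL);

    /* REG_NEWLINE lets '^' anchor at the start of every line. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE) != 0)
        return -1;
    if (regexec(&re, text, 2, m, 0) == 0)
        value = strtoll(text + m[1].rm_so, NULL, 10);
    regfree(&re);
    return value;
}
```

With patterns like these, netstat_value(tcp_sample, RE_SEG_OUT) reads the
counter whether the line says "segments send out" or "segments sent out",
so one Linux entry could cover both net-tools spellings.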

Additionally, it's probably time to start adding 'ss' output into the 
client for future use.

Regards,
-jc

On 3/13/2019 7:53 AM, SebA wrote:
Hi John,

I can confirm that we have the same issue on Debian 9 clients on the 
Network I/O (as labelled) graph.

Kind regards,

SebA


On Tue, 12 Mar 2019 at 22:48, John Horne <user-e95f1ec2f147@xymon.invalid> wrote:

    Hi,

    We have three Debian 9 (stretch) servers running the xymon-client package
    (version 4.3.28-2). I noticed yesterday that the 'trends' column for these
    clients was showing a 'TCP/IP statistics' graph, but was only showing
    values for the 'In' graph. Both the 'Out' and 'Retrans' values were NaN.

    If anyone else is running the Xymon client on a Debian 9 server, or maybe
    even Debian 8, could they check their 'trends' graphs and see if the same
    problem exists for them? Thanks.

    As far as I can tell this started when we upgraded the client server O/S
    from Debian 7 to 9 (a few months ago now!). It seems there were values for
    all 3 graph lines when they ran Debian 7.

    The RRD netstat file, which is used for the statistics graph, shows a 'U'
    for the 'Out' and 'Retrans' values. Looking at the actual code (in
    xymond/rrd/do_netstat.c), it seems that the netstat output is expected to
    be the same for most Linux distributions and versions. It shows:

    =========
    /* This one matches all Linux systems */
    static char *netstat_linux_markers[] = {
            "packets received",
            "packets sent",
            "packet receive errors",
            "active connections openings",
            "passive connection openings",
            "failed connection attempts",
            "connection resets received",
            "connections established",
            "",
            "",
            "",
            "",
            "segments send out",    /* Yes, they really do write "send" */
            "segments received",
            "",
            "segments retransmited",
            NULL
    };
    =========

    However, the netstat output collected by the clients shows, looking at
    the TCP section:

    =========
    Tcp:
        3045575 active connection openings
        251770 passive connection openings
        9335 failed connection attempts
        4520 connection resets received
        37 connections established
        1359715245 segments received
        1330630207 segments sent out
        119457 segments retransmitted
        10 bad segments received
        32339 resets sent
        InCsumErrors: 2
    =========


    Some things seem to be wrong:

    1) The code looks for 'active connections openings', but netstat shows
    'active connection openings'. The word 'connection' is singular.

    2) The code looks for 'segments send out', but netstat shows 'segments
    sent out'. So despite the comment in the code, the output uses 'sent'
    rather than 'send'.

    3) The code looks for 'segments retransmited', but the netstat output
    shows 'segments retransmitted'. So there is a double 't' in
    'retransmitted' (or 3 altogether).

    4) The order of the 'segments' lines is different from the code, but I'm
    not sure if that is important. (I haven't looked at the code in that much
    depth.)
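
The mismatches listed above can be checked mechanically. The sketch below
assumes the parser locates each counter by plain substring search on its
marker string; that is an assumption about do_netstat.c, not something
confirmed in this thread. The names old_markers, tcp_section and
count_missing are hypothetical. The marker strings and the netstat lines
are copied from the two excerpts above.

```c
#include <stddef.h>
#include <string.h>

/* TCP-related entries from the netstat_linux_markers[] excerpt above
 * (empty placeholder entries omitted). */
static const char *old_markers[] = {
    "active connections openings",
    "passive connection openings",
    "failed connection attempts",
    "connection resets received",
    "connections established",
    "segments send out",
    "segments received",
    "segments retransmited",
    NULL
};

/* "Tcp:" section as collected from the Debian 9 client above. */
static const char tcp_section[] =
    "Tcp:\n"
    "    3045575 active connection openings\n"
    "    251770 passive connection openings\n"
    "    9335 failed connection attempts\n"
    "    4520 connection resets received\n"
    "    37 connections established\n"
    "    1359715245 segments received\n"
    "    1330630207 segments sent out\n"
    "    119457 segments retransmitted\n"
    "    10 bad segments received\n"
    "    32339 resets sent\n"
    "    InCsumErrors: 2\n";

/* Hypothetical check: count the markers that occur nowhere in `text`.
 * Assumes plain substring matching, as stated in the lead-in. */
static int count_missing(const char *text, const char **markers)
{
    int missing = 0;

    for (size_t i = 0; markers[i] != NULL; i++) {
        if (strstr(text, markers[i]) == NULL)
            missing++;
    }
    return missing;
}
```

Under that assumption, count_missing(tcp_section, old_markers) reports
three unmatched markers: the 'active connections openings', 'segments send
out' and 'segments retransmited' entries. 'segments received' still
matches, which would be consistent with only the 'In' line having values.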


    I'm not sure if this is the cause of the TCP/IP stats graph not showing
    values, but it doesn't seem right.


    Thanks,

    John.

    --
    John Horne | Senior Operations Analyst | Technology and Information Services
    University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK