Xymon Mailing List Archive

Debian 9 stretch clients - trends netstat graph bug?

8 messages in this thread

list John Horne · Tue, 12 Mar 2019 22:48:12 +0000 ·
Hi,

We have three Debian 9 (stretch) servers running the xymon-client package
(version 4.3.28-2). I noticed yesterday that the 'trends' column for these
clients was showing a 'TCP/IP statistics' graph, but was only showing values
for the 'In' graph. Both the 'Out' and 'Retrans' values were NaN.

If anyone else is running the Xymon client on a Debian 9 server, or maybe even
Debian 8, could they check their 'trends' graphs and see whether the same
problem exists for them? Thanks.

As far as I can tell this started when we upgraded the client server O/S from
Debian 7 to 9 (a few months ago now!). It seems there were values for all 3
graph lines when they ran Debian 7.

The RRD netstat file, which is used for the statistics graph, shows a 'U' for
the 'Out' and 'Retrans' values. Looking at the actual code (in
xymond/rrd/do_netstat.c), it seems that the netstat output is expected to be
the same for most Linux distributions and versions. It shows:

=========
/* This one matches all Linux systems */
static char *netstat_linux_markers[] = {
        "packets received",
        "packets sent",
        "packet receive errors",
        "active connections openings",
        "passive connection openings",
        "failed connection attempts",
        "connection resets received",
        "connections established",
        "",
        "",
        "",
        "",
        "segments send out",    /* Yes, they really do write "send" */
        "segments received",
        "",
        "segments retransmited",
        NULL
};
=========

However, the netstat output collected by the clients shows, looking at the TCP
section:

=========
Tcp:
    3045575 active connection openings
    251770 passive connection openings
    9335 failed connection attempts
    4520 connection resets received
    37 connections established
    1359715245 segments received
    1330630207 segments sent out
    119457 segments retransmitted
    10 bad segments received
    32339 resets sent
    InCsumErrors: 2
=========


Some things seem to be wrong:

1) The code looks for 'active connections openings', but netstat shows 'active
connection openings'. Singular on the 'connection' word.

2) The code looks for 'segments send out', but netstat shows 'segments sent
out'. So despite the comment in the code, the output uses 'sent' rather than
'send'.

3) The code looks for 'segments retransmited', but the netstat output shows
'segments retransmitted'. So 'retransmitted' is spelled with a double 't'
(three in total).

4) The order of the 'segments' lines is different from the code, but I'm not
sure if that is important. (Haven't looked at the code that much in depth.)


I'm not sure if this is the cause of the TCP/IP stats graph not showing values,
but it doesn't seem right.
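
The mismatches listed above can be checked mechanically. The following is only a minimal sketch, assuming the markers are compared as plain substrings of each netstat line (the real matching logic in do_netstat.c may differ); marker_found() and new_output are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A few TCP lines as reported by the Debian 9 clients (copied from the
 * netstat output shown above). */
static const char *new_output[] = {
    "    3045575 active connection openings",
    "    1359715245 segments received",
    "    1330630207 segments sent out",
    "    119457 segments retransmitted",
    NULL
};

/* Hypothetical helper: returns 1 if the marker occurs as a substring of
 * at least one line in the NULL-terminated array, 0 otherwise. */
static int marker_found(const char *marker, const char **lines)
{
    assert(marker != NULL && lines != NULL);
    for (size_t i = 0; lines[i] != NULL; i++) {
        if (strstr(lines[i], marker) != NULL) return 1;
    }
    return 0;
}
```

Under that assumption, "segments received" is still found, while the three old spellings ("active connections openings", "segments send out", "segments retransmited") are not, which would be consistent with only the 'In' line having values.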


Thanks,

John.

--
John Horne | Senior Operations Analyst | Technology and Information Services
University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK
[http://www.plymouth.ac.uk/images/email_footer.gif]<http://www.plymouth.ac.uk/worldclass>;

This email and any files with it are confidential and intended solely for the use of the recipient to whom it is addressed. If you are not the intended recipient then copying, distribution or other use of the information contained is strictly prohibited and you should not rely on it. If you have received this email in error please let the sender know immediately and delete it from your system(s). Internet emails are not necessarily secure. While we take every care, University of Plymouth accepts no responsibility for viruses and it is your responsibility to scan emails and their attachments. University of Plymouth does not accept responsibility for any changes made after it was sent. Nothing in this email or its attachments constitutes an order for goods or services unless accompanied by an official order form.
list Sebastian Auriol · Wed, 13 Mar 2019 14:53:02 +0000 ·
Hi John,

I can confirm that we have the same issue on Debian 9 clients on the
Network I/O (as labelled) graph.

Kind regards,

SebA

list Torsten Richter · Wed, 13 Mar 2019 19:29:16 +0100 ·
Hi,

just checked my servers here and I can confirm the issue with Debian 9.
Clients running Debian 8 or 7 don't seem to be affected.

Regards,
Torsten
-- 
E-mail  : user-c862b499d9fa@xymon.invalid
Homepage: http://www.richter-it.net/
list Japheth Cleaver · Mon, 18 Mar 2019 10:30:22 -0700 ·
This is not just a Debian thing -- it looks like it results from output 
changes in recent versions of netstat/net-tools (recent Fedora is 
affected too).

I think the easiest fix will be to move to PCRE parsing to catch the 
differing versions, which is how we're handling the BSDs and some of the 
esoteric Unices. (Although there's a legacy RHEL3 Linux OS type from 
back in the day, this is one of the things we'll want to catch centrally 
moving forward.)
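
As a rough illustration of the regex idea, the sketch below uses POSIX <regex.h> as a stand-in for PCRE so that it stays self-contained; it is not how Xymon's existing PCRE code is written. The three patterns and the line_matches() helper are assumptions made for this example, each pattern accepting both the old and the new net-tools spelling.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical patterns: each one accepts the old and the new spelling. */
static const char *PAT_ACTIVE_OPENS = "active connections? openings";
static const char *PAT_SEGS_OUT     = "segments sen[dt] out";
static const char *PAT_RETRANS      = "segments retransmit+ed";

/* Returns 1 if the line matches the extended regex, 0 if it does not,
 * and -1 if the pattern fails to compile. */
static int line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && line != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) return -1;
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return (rc == 0) ? 1 : 0;
}
```

Compiling the pattern on every call keeps the sketch short; real code would presumably compile each pattern once and reuse it.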

Additionally, it's probably time to start adding 'ss' output into the 
client for future use.

Regards,
-jc
list John Horne · Mon, 18 Mar 2019 18:11:59 +0000 ·
On Mon, 2019-03-18 at 10:30 -0700, Japheth Cleaver wrote:
> This is not just a Debian thing -- it looks like it results from output
> changes in recent versions of netstat/net-tools (recent Fedora is affected
> too).
>
> I think the easiest fix will be to move to PCRE parsing to catch the
> differing versions, which is how we're handling the BSDs and some of the
> esoteric Unices. (Although there's a legacy RHEL3 Linux OS type from back in
> the day, this is one of the things we'll want to catch centrally moving
> forward.)

I was going to take a look at the problem tomorrow. From my very quick look at
the code I thought it already had support for PCRE, and so hopefully it would
just require changing the code from a fixed string to a regex to cater for the
current and new net-tools versions.


John.
list David · Mon, 18 Mar 2019 15:16:09 -0700 ·
On 3/18/19 11:11 AM, John Horne wrote:
> I was going to take a look at the problem tomorrow. From my very quick look at
> the code I thought it already had support for PCRE, and so hopefully it would
> just require changing the code from a fixed string to a regex to cater for the
> current and new net-tools versions.

I think it's a bigger issue: the netstat command no longer reports as much 
information as it used to.

Digging back through old logs, this is the output that was producing the 
outbound and retransmit values:


[date]
Mon Jun 18 09:14:39 PDT 2018
...
[netstat]
Ip:
     35434 total packets received
     6 with invalid addresses
     0 forwarded
     0 incoming packets discarded
     31540 incoming packets delivered
     31014 requests sent out
     300 outgoing packets dropped
     30 dropped because of missing route
Icmp:


And the output received currently:

[date]
Mon Mar 18 01:23:29 PDT 2019
...
[netstat]
Ip:
     Forwarding: 2
     20425327 total packets received
     25 with invalid addresses
     0 forwarded
     0 incoming packets discarded
     20335050 incoming packets delivered
     6916589 requests sent out
Icmp:


Clearly, the new tools are NOT collecting the data we need to parse, so 
work on changing anything other than the tools directly, or finding 
another tool to use, would be pointless.

david
list John Horne · Thu, 21 Mar 2019 11:57:16 +0000 ·
On Mon, 2019-03-18 at 10:30 -0700, Japheth Cleaver wrote:
> This is not just a Debian thing -- it looks like it results from output
> changes in recent versions of netstat/net-tools (recent Fedora is affected
> too).
>
> I think the easiest fix will be to move to PCRE parsing to catch the
> differing versions, which is how we're handling the BSDs and some of the
> esoteric Unices.

Hi,

I would have agreed, but as far as I can tell the code looks for a specific
order for the data when a regex is used. This is fine for *BSD/UNIX, but is not
valid for Linux. So it cannot use a regex (without changing yet more code).

I thought about defining a new OS type other than 'LINUX' and 'LINUX22', but
felt that the type may well be used in other parts of Xymon. So adding a new
Linux type may cause problems elsewhere.

In the end I made a relatively simple change to just one file (do_netstat.c).
For Linux systems it checks the netstat output for one of the typo strings. If
it is found, then it uses the current code. If it is not found, then it uses
the new set of strings to check the netstat output. (Basically the new set is
the same as the old one, but with the typos corrected.)

I checked the net-tools source and it seems that all the typos affecting Xymon
were fixed in one commit. So checking for just one typo in the code should work
fine.
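
The approach described above might look roughly like the sketch below. This is not the attached patch: the function name select_linux_markers(), the choice of "segments send out" as the typo to test for, and the assumption that the corrected set keeps the same slot order as the old one are all illustrative guesses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old markers, typos included, as in the array quoted earlier in the thread. */
static const char *markers_old[] = {
    "packets received", "packets sent", "packet receive errors",
    "active connections openings", "passive connection openings",
    "failed connection attempts", "connection resets received",
    "connections established", "", "", "", "",
    "segments send out", "segments received", "",
    "segments retransmited", NULL
};

/* Assumed corrected set: same slots, with the three typos fixed. */
static const char *markers_new[] = {
    "packets received", "packets sent", "packet receive errors",
    "active connection openings", "passive connection openings",
    "failed connection attempts", "connection resets received",
    "connections established", "", "", "", "",
    "segments sent out", "segments received", "",
    "segments retransmitted", NULL
};

/* Hypothetical helper: if the netstat text still contains one of the old
 * typo strings, use the old marker set; otherwise use the corrected one. */
static const char **select_linux_markers(const char *netstat_output)
{
    assert(netstat_output != NULL);
    if (strstr(netstat_output, "segments send out") != NULL) return markers_old;
    return markers_new;
}
```

Testing for a single typo relies on the observation above that net-tools fixed all the relevant typos in one commit, so the presence of one old spelling implies the others.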

Patch file attached.

John.
On 3/13/2019 7:53 AM, SebA wrote:
Hi John,

I can confirm that we have the same issue on Debian 9 clients on the
Network I/O (as labelled) graph.

Kind regards,

SebA


On Tue, 12 Mar 2019 at 22:48, John Horne <user-e95f1ec2f147@xymon.invalid> wrote:
Hi,

We have three Debian 9 (stretch) servers running the xymon-client package
(version 4.3.28-2). I noticed yesterday that the 'trends' column for
these
clients was showing a 'TCP/IP statistics' graph, but was only showing
values
for the 'In' graph. Both the 'Out' and 'Retrans' values were NaN.

If anyone else is running the Xymon client on a Debian 9 server, or may
be even
Debian 8, could they check their 'trends' graphs and see if the same
problem
exists for them. Thanks.

As far as I can tell this started when we upgraded the client server O/S
from
Debian 7 to 9 (a few months ago now!). It seems there were values for all
3
graph lines when they ran Debian 7.

The RRD netstat file, which is used for the statistics graph, shows a 'U'
for
the 'Out' and 'Retrans' values. Looking at the actual code (in
xymond/rrd/do_netstat.c), it seems that the netstat output is expected to
be
the same for most Linux distributions and versions. It shows:

=========
/* This one matches all Linux systems */
static char *netstat_linux_markers[] = {
        "packets received",
        "packets sent",
        "packet receive errors",
        "active connections openings",
        "passive connection openings",
        "failed connection attempts",
        "connection resets received",
        "connections established",
        "",
        "",
        "",
        "",
        "segments send out",    /* Yes, they really do write "send" */
        "segments received",
        "",
        "segments retransmited",
        NULL
};
=========

However, the netstat output collected by the clients shows, looking at
the TCP
section:

=========
Tcp:
    3045575 active connection openings
    251770 passive connection openings
    9335 failed connection attempts
    4520 connection resets received
    37 connections established
    1359715245 segments received
    1330630207 segments sent out
    119457 segments retransmitted
    10 bad segments received
    32339 resets sent
    InCsumErrors: 2
=========


Some things seem to be wrong:

1) The code looks for 'active connections openings', but netstat shows
'active
connection openings'. Singular on the 'connection' word.

2) The code looks for 'segments send out', but netstat shows 'segments
sent
out'. So despite the comment in the code, the output uses 'sent' rather
than
'send'.

3) The code looks for 'segments retransmited', but the netstat output
shows
'segments retransmitted'. So there is a double 't' in retransmitted (or 3
all
together).

4) The order of the 'segments' lines is different from the code, but I'm
not
sure if that is important. (Haven't looked at the code that much in
depth.)


I'm not sure if this is the cause of the TCP/IP stats graph not showing
values,
but it doesn't seem right.


Thanks,

John.

--
John Horne | Senior Operations Analyst | Technology and Information
Services
University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK
[
http://www.plymouth.ac.uk/images/email_footer.gif]<http://www.plymouth.ac.uk/worldclass
This email and any files with it are confidential and intended solely for
the use of the recipient to whom it is addressed. If you are not the
intended recipient then copying, distribution or other use of the
information contained is strictly prohibited and you should not rely on
it. If you have received this email in error please let the sender know
immediately and delete it from your system(s). Internet emails are not
necessarily secure. While we take every care, University of Plymouth
accepts no responsibility for viruses and it is your responsibility to
scan emails and their attachments. University of Plymouth does not accept
responsibility for any changes made after it was sent. Nothing in this
email or its attachments constitutes an order for goods or services
unless accompanied by an official order form.

--
John Horne | Senior Operations Analyst | Technology and Information Services
University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK

This email and any files with it are confidential and intended solely for the use of the recipient to whom it is addressed. If you are not the intended recipient then copying, distribution or other use of the information contained is strictly prohibited and you should not rely on it. If you have received this email in error please let the sender know immediately and delete it from your system(s). Internet emails are not necessarily secure. While we take every care, University of Plymouth accepts no responsibility for viruses and it is your responsibility to scan emails and their attachments. University of Plymouth does not accept responsibility for any changes made after it was sent. Nothing in this email or its attachments constitutes an order for goods or services unless accompanied by an official order form.
list Japheth Cleaver · Thu, 21 Mar 2019 06:03:24 -0700 ·
Hi,

I hadn't announced it yet, but the commit yesterday was in a similar 
vein. Switching to a full regex here for Linux was more complicated (and 
slower!) than simply branching off one of the new net-tools strings. 
This fix will be in 4.3.29.

https://sourceforge.net/p/xymon/code/8036/

Regards,
-jc
quoted from John Horne

On 3/21/2019 4:57 AM, John Horne wrote:
On Mon, 2019-03-18 at 10:30 -0700, Japheth Cleaver wrote:
This is not just a Debian thing -- it looks like it results from output
changes in recent versions of netstat/net-tools (recent Fedora is affected
too).

I think the easiest fix will be to move to PCRE parsing to catch the
differing versions, which is how we're handling the BSDs and some of the
esoteric Unices.

Hi,

I would have agreed, but as far as I can tell the code looks for a specific
order for the data when a regex is used. This is fine for *BSD/UNIX, but is not
valid for Linux. So it cannot use a regex (without changing yet more code).

I thought about defining a new OS type other than 'LINUX' and 'LINUX22', but
felt that the type may well be used in other parts of Xymon. So adding a new
Linux type may cause problems elsewhere.

In the end I made a relatively simple change to just one file (do_netstat.c).
For Linux systems it checks the netstat output for one of the typo strings. If
it is found, then it uses the current code. If it is not found, then it uses
the new set of strings to check the netstat output. (Basically the new set is
the same as the old one, but with the typos corrected.)

I checked the net-tools source and it seems that all the typos affecting Xymon
were fixed in one commit. So checking for just one typo in the code should work
fine.
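
The approach described above could look roughly like the sketch below. This is not the attached patch, and not necessarily what went into the commit for 4.3.29. The names legacy_markers, fixed_markers and select_linux_markers are hypothetical, probing for 'segments send out' is an assumed choice of typo string, and the corrected set changes only the three strings identified earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Marker set with the older net-tools spellings, as quoted in this thread. */
const char *const legacy_markers[] = {
    "packets received",
    "packets sent",
    "packet receive errors",
    "active connections openings",
    "passive connection openings",
    "failed connection attempts",
    "connection resets received",
    "connections established",
    "", "", "", "",
    "segments send out",
    "segments received",
    "",
    "segments retransmited",
    NULL
};

/* Hypothetical corrected set: same slots, with the three typos fixed. */
const char *const fixed_markers[] = {
    "packets received",
    "packets sent",
    "packet receive errors",
    "active connection openings",
    "passive connection openings",
    "failed connection attempts",
    "connection resets received",
    "connections established",
    "", "", "", "",
    "segments sent out",
    "segments received",
    "",
    "segments retransmitted",
    NULL
};

/* Pick a marker set by probing the raw netstat text for one old typo.
 * Assumption: all three typos were fixed together in net-tools, so one
 * probe string is enough to tell the two output formats apart. */
const char *const *select_linux_markers(const char *netstat_output)
{
    assert(netstat_output != NULL);
    if (strstr(netstat_output, "segments send out") != NULL)
        return legacy_markers;
    return fixed_markers;
}
```

Because both arrays keep the same slot positions, the rest of the parsing code can stay unchanged and only the lookup table is swapped.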

Patch file attached.


John.

On 3/13/2019 7:53 AM, SebA wrote:
Hi John,

I can confirm that we have the same issue on Debian 9 clients on the
Network I/O (as labelled) graph.

Kind regards,

SebA

