Xymon Mailing List Archive

Parsing problems with client reports

6 messages in this thread

list Scot Kreienkamp · Tue, 13 Dec 2016 15:01:43 +0000 ·
Hello list,

Has anyone been working on the problem where reports from clients are getting confused on the server?  I am getting multiple false alerts daily now where the test turns back to green at the next run, and when I look at the test in Xymon it's because it only has partial information.  In multiple instances now I've been able to see the report from another client, in part or in full, on one client's page.  In the most recent incident last week I could see the beginning client line from another Windows client report halfway down in the ports test for a Linux client.

All the false reports are really damaging the trust we place in Xymon to monitor our systems, and people are starting to just delete the warning emails without looking at them because we get so many false reports now.

Thanks

Scot Kreienkamp | Senior Systems Engineer | La-Z-Boy Corporate
One La-Z-Boy Drive | Monroe, Michigan 48162 | Office: XXX-XXX-XXXX | Mobile: XXXXXXXXXX | Email: user-9678697f1438@xymon.invalid


This message is intended only for the individual or entity to which it is addressed.  It may contain privileged, confidential information which is exempt from disclosure under applicable laws.  If you are not the intended recipient, you are strictly prohibited from disseminating or distributing this information (other than to the intended recipient) or copying this information.  If you have received this communication in error, please notify us immediately by e-mail or by telephone at the above number. Thank you.
list Japheth Cleaver · Tue, 13 Dec 2016 08:59:56 -0800 ·
Can you enable MALLOC_PERTURB_=1 and MALLOC_CHECK_=3 in the xymon environment? I've noticed this occasionally in the past and added debugging logic, but I wasn't able to reliably duplicate it. It did seem to happen more often when the server was under memory pressure, and it didn't seem to be related to truncated (or garbled) incoming TCP connections.

Also, what version and distro?

-jc
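
For readers unfamiliar with the two variables: both are read by glibc's malloc from the process environment, so they only affect processes started with them set. A minimal Python sketch of launching a single command that way follows. The helper names (malloc_debug_env, run_with_malloc_debug) are hypothetical, and how the variables reach the Xymon daemons themselves depends on how the installation starts them, which this thread does not show.

```python
import os
import subprocess


def malloc_debug_env(base=None, perturb=1, check=3):
    """Return a copy of an environment with the two glibc variables set.

    `base` defaults to the current process environment. The values 1 and 3
    are the ones suggested in the thread.
    """
    env = dict(os.environ if base is None else base)
    env["MALLOC_PERTURB_"] = str(perturb)
    env["MALLOC_CHECK_"] = str(check)
    return env


def run_with_malloc_debug(argv):
    """Run `argv` with malloc debugging enabled and return its exit code."""
    completed = subprocess.run(argv, env=malloc_debug_env(), check=False)
    return completed.returncode
```

Only the child process sees the variables; a daemon that is already running has to be restarted before they take effect.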
list Scot Kreienkamp · Tue, 13 Dec 2016 20:30:30 +0000 ·
OK, I’ve added them so they get activated by the xymon user.  Not sure what to do next with them…

It’s Oracle Linux Server release 7.2 running the latest RPM install from terabithia.


list Scot Kreienkamp · Wed, 14 Dec 2016 15:48:07 +0000 ·
Not sure what those switches do but the xymon server has been much quieter since they were applied.
list Japheth Cleaver · Wed, 14 Dec 2016 13:46:11 -0800 ·
That's strange...
If anything, those switches should cause any hidden bugs with memory handling to be more quickly exposed, even to the point of aborting.

Have you noticed any free/alloc issues in xymond.log, or any other altered behavior since bouncing it?

-jc
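
As background on why the switches would normally expose bugs rather than hide them, here is a rough Python model of what the two values mean, based on the description in the glibc mallopt(3) man page. It is a sketch, not glibc code: the function names are hypothetical, only values 0..255 are handled, and details differ between glibc versions.

```python
def perturb_fill_bytes(value):
    """Fill bytes used for a MALLOC_PERTURB_ value in the range 0..255.

    Newly allocated memory (other than calloc) is filled with the
    complement of the byte, and freed memory with the byte itself.
    """
    if not 0 <= value <= 255:
        raise ValueError("this sketch only handles values 0..255")
    if value == 0:
        return None  # perturbing is disabled
    return {"alloc": value ^ 0xFF, "free": value}


def check_actions(value):
    """Decode the two low bits of a MALLOC_CHECK_ value."""
    return {
        "print_diagnostic": bool(value & 1),  # bit 0: report the error
        "abort": bool(value & 2),             # bit 1: abort the process
    }
```

With the values from this thread, MALLOC_PERTURB_=1 means fresh allocations are filled with 0xFE and freed blocks with 0x01, so code that reads uninitialised or already-freed memory sees obvious junk instead of stale data, and MALLOC_CHECK_=3 means a detected heap error is both reported and fatal.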
list Scot Kreienkamp · Thu, 15 Dec 2016 14:53:36 +0000 ·
Nothing.  No different behavior (other than no false alerts), no lines containing free or alloc in any log in /var/log/xymon, nothing.  Just quiet, smooth operation.

Maybe in a specific function we don’t use that often?  I’ll leave the lines in there and see if anything happens.  If it continues to run smoothly I’ll take the lines out after a month and see if it goes back to false reports.
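
The log check described above can be repeated with a small script. This is a sketch under assumptions: the function name find_allocator_lines is hypothetical, the search is a plain case-insensitive match on "free" or "alloc", and it looks only at regular files directly inside the given directory.

```python
import re
from pathlib import Path

# Words that typically appear in glibc heap diagnostics.
ALLOCATOR_WORDS = re.compile(r"free|alloc", re.IGNORECASE)


def find_allocator_lines(log_dir):
    """Return (file name, line number, line) for every matching log line."""
    hits = []
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log has stray bytes.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if ALLOCATOR_WORDS.search(line):
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits
```

Calling find_allocator_lines("/var/log/xymon") and getting an empty list would correspond to the "no lines containing free or alloc" result reported here.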