leaking memory, alerts.cfg problems
list Tres Finocchiaro
Our organization is, I'd assume, one of many evaluating their options now that Dell no longer supports BigBrother. We had a fairly smooth transition to Xymon (we compiled from source on Ubuntu 14.04.1 LTS x86-64). The problem we're having is that we can't find a sane way to remove a service check from our switches/routers, specifically TELNET. We removed the column from our webpage, but we still received alerts on it. We've made customizations to our alerts, but now, no matter what we try, we can't get alerts to fire, and we sometimes see messages like:
someuser at somehost:~$ sudo vi /etc/xymon/alerts.cfg
[sudo] password for someuser: no talloc stackframe at ../source3/param/loadparm.c:4864, leaking memory
I've scoured the internet to no avail. Any insight? If we can get the product running reliably we're sure to use it for years to come. :) -Tres - user-88678e65ced1@xymon.invalid
list Jeremy Laidman
On 9 September 2014 23:57, Tres Finocchiaro <user-88678e65ced1@xymon.invalid> wrote:
no talloc stackframe at ../source3/param/loadparm.c:4864, leaking memory
This is a bug in Samba. Refer: https://bugs.launchpad.net/ubuntu/+source/samba/+bug/1257186 This is entirely unrelated to Xymon. If you don't have "telnet" listed for a host/device in hosts.cfg, or in the ".default." entry, then Xymon should not include it in its service checks. J
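As an illustration of the check described above, here is a minimal Python sketch that lists which entries in a hosts.cfg name a given test in their tag list. It assumes the simplified "IP HOSTNAME # tag tag ..." layout seen in this thread and matches tags exactly, so modifiers or parameters attached to a tag are not handled. The sample entries (a `.default.` line carrying `telnet`, and SWITCH2) are hypothetical.

```python
def hosts_listing_test(hosts_cfg_text, test):
    """Return hostnames (including .default.) whose tag list names `test`."""
    found = []
    for line in hosts_cfg_text.splitlines():
        entry, sep, tags = line.partition("#")
        fields = entry.split()
        # Host entries look like "IP HOSTNAME # tag tag ..."; skip anything else
        # (group lines, blank lines, whole-line comments).
        if not sep or len(fields) < 2:
            continue
        if test in tags.split():
            found.append(fields[1])
    return found


# Hypothetical sample, not the poster's actual file.
sample = """\
group-sorted Switches
0.0.0.0       .default.   # telnet
192.168.0.5   SWITCH1     # ssh
192.168.0.6   SWITCH2     # ssh telnet
"""

print(hosts_listing_test(sample, "telnet"))
```

If the list comes back empty for a test that still shows up on the web page, the leftover is more likely stored status data than anything in hosts.cfg.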
list Tres Finocchiaro
@Jeremy,
▸
This is a bug in Samba. Refer: https://bugs.launchpad.net/ubuntu/+source/samba/+bug/1257186 This is entirely unrelated to Xymon.
Terrific, thank you.
▸
If you don't have "telnet" listed for a host/device in hosts.cfg, or in the ".default." entry, then Xymon should not include it in its service checks.
OK, this is where I'm struggling. We have a separate host group for our switches:
group-sorted UFIRST Switches
0.0.0.0 .default. # NOCOLUMNS:telnet
192.168.0.5 SWITCH1 # ssh
But we originally had:
192.168.0.5 SWITCH1 # telnet
We followed the instructions for purging the history:
xymon 127.0.0.1 "drop SWITCH1 telnet"
But we still received alerts:
red Wed Sep 3 09:26:38 2014 telnet NOT ok
Service telnet on SWITCH1 is not OK : Service unavailable (Connection refused) Seconds: 0.00
So I tried filtering it in alerts.cfg:
HOST=%* SERVICE=%*
MAIL user-6113c4a3e279@xymon.invalid EXSERVICE=telnet
But now we aren't receiving any alerts at all. To verify it's not an email setting:
mail -s "Test Email" user-6113c4a3e279@xymon.invalid < /dev/null
Which succeeds. The only part I did not mention is that some of our hosts were renamed from lowercase to uppercase for aesthetics.
list Carl Inglis
From the man page:
NOCOLUMNS:column[,column]
Used to drop certain of the status columns generated by the Xymon client. column is one of cpu, disk, files, memory, msgs, ports, procs. This setting stops these columns from being updated for the host. Note: If the columns already exist, you must use the xymon(1) utility to drop them, or they will go purple.
That suggests to me that your “NOCOLUMNS: telnet” is invalid – makes me wonder if it’s triggering that test as a result of that .default. entry. Be interested to see what you find.
Regards, Carl
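To make the man page comparison concrete, here is a small Python sketch that reports NOCOLUMNS values not found in the column list quoted above. It only checks a line against that documented list; how Xymon itself treats other values is not established in this thread, so treat the result as a hint rather than a verdict. The second sample line is hypothetical.

```python
# Column names copied from the man page excerpt quoted above.
CLIENT_COLUMNS = {"cpu", "disk", "files", "memory", "msgs", "ports", "procs"}


def unexpected_nocolumns(hosts_cfg_line):
    """Return NOCOLUMNS values on this line that the man page does not list."""
    unexpected = []
    # Tags live after the '#' separator in a hosts.cfg entry.
    for tag in hosts_cfg_line.partition("#")[2].split():
        if tag.startswith("NOCOLUMNS:"):
            for column in tag[len("NOCOLUMNS:"):].split(","):
                if column and column not in CLIENT_COLUMNS:
                    unexpected.append(column)
    return unexpected


print(unexpected_nocolumns("0.0.0.0 .default. # NOCOLUMNS:telnet"))
print(unexpected_nocolumns("192.168.0.5 SWITCH1 # ssh NOCOLUMNS:cpu,disk"))
```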
Carl Inglis AMBCS
Systems Administrator
Rakon UK Limited
Dowsett House, Sadler Road, Lincoln LN6 3RS, United Kingdom
Tel: +XX XXXX XXXXXX | Fax: +XX XXXX XXXXXX | Mob: +44 7786 552915
user-96685bdc864b@xymon.invalid | www.rakon.com
This message together with any attachments contains confidential information and may be
subject to privilege. If you are not the intended recipient you may not distribute it in any
way, you must notify the sender immediately and delete any copies of the message along
with its attachments.
Rakon UK Ltd is a limited company registered in England and Wales.
Registered Office: Antell House, Windsor Place, Harlow, Essex, England, CM20 2GQ
Company Registration Number: 5128090.
Please be aware that Rakon UK Limited may monitor email traffic data
including the date, time, subject line, sender and recipients for the
purposes of security and usage monitoring. Automated monitoring
systems may also be applied to ascertain whether incoming/outgoing
emails are likely to contain viruses, other destructive devices or
inappropriate content.
list Tres Finocchiaro
That suggests to me that your “NOCOLUMNS: telnet” is invalid
How so? My usage matches the manpage, no?
list Carl Inglis
The manpage says “column is one of cpu, disk, files, memory, msgs, ports, procs.” – it doesn’t mention “telnet” as an option to drop. Regards, Carl
list Tres Finocchiaro
▸
The manpage says “column is one of cpu, disk, files, memory, msgs, ports, procs.” – it doesn’t mention “telnet” as an option to drop.
Well, then the manpage should be updated.
Here is the page normally: http://i.imgur.com/8bKAzvg.png
And here is the page with "0.0.0.0 .default. # NOCOLUMNS:telnet" added.
http://i.imgur.com/0DYn78E.png
To entertain the suggestion, I've commented out the .default. line too but still no alerts... I'm not sure where else to look.
list Glauber Ribeiro
Maybe I’m confused, but what’s the point of using NOCOLUMNS for something that’s not one of the statuses sent by the client?
If you want to omit telnet, simply don’t list it in hosts.cfg for the host. If you want to drop an existing column, look into “Tips and Tricks” under the Help menu, and navigate to “How do I delete a test status?”.
NOCOLUMNS is for something that the client will continue to send, and you can’t turn off, but don’t want to display.
glauber
list Carl Inglis
AH! You have a purple status – looks to me like the “drop” command didn’t work. Perhaps try dropping it with the hostname in lower case? I’ve found in the past that drop can be temperamental and I’ve on occasion ended up dropping the whole host to check that I’ve got it right.
▸
Regards.
Carl
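Since the hosts in this thread were renamed from lowercase to uppercase, one way to follow the suggestion above is to issue the drop once per spelling of the hostname. The Python sketch below only builds the command strings, using the `xymon 127.0.0.1 "drop HOST TEST"` form shown earlier in the thread; it does not run anything, and the helper name is made up for illustration.

```python
def drop_commands(hostname, test, server="127.0.0.1"):
    """Build one 'drop' command per distinct spelling of the hostname."""
    variants = []
    for name in (hostname, hostname.lower(), hostname.upper()):
        if name not in variants:
            variants.append(name)
    return ['xymon {} "drop {} {}"'.format(server, name, test) for name in variants]


# Print the commands for review before running any of them by hand.
for command in drop_commands("SWITCH1", "telnet"):
    print(command)
```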
list Tres Finocchiaro
@Ribeiro, Thanks kindly for the reply. Please see my responses below.
▸
Maybe I’m confused, but what’s the point of using NOCOLUMNS for something that’s not one of the statuses that is sent by the client?
I agree. I mentioned in my second post that I ran the "drop" statement per instructions. I've done this several times, but telnet keeps coming back. Hiding it using NOCOLUMNS is the only way to keep the page from going purple.
If you want to omit telnet, simply don’t list it in hosts.cfg for the host.
As aforementioned, we don't list it in hosts.cfg; we list # ssh instead. I'm not sure why it keeps coming back.
▸
If you want to drop an existing column, look into “Tips and Tricks” under the Help menu, and navigate to “How do I delete a test status?”.
As aforementioned, we did this.
▸
NOCOLUMNS is for something that the client will continue to send, and you can’t turn off, but don’t want to display.
Since the client is a switch, I'm not sure we have the ability to alter anything from the "send" side of the fence. Let me know if this assumption is incorrect.
@Carl, I just saw your reply come in. I'll try your recommendations now.
-Tres - user-88678e65ced1@xymon.invalid
list Tres Finocchiaro
@Carl, Thanks, I believe that corrected the lingering telnet. It has not reappeared, thank you. But I'm still not receiving alerts. We've intentionally left SSH disabled on this switch to trigger an alert, but emails don't seem to be going through. What is the command to trigger a test email from xymon? I found this in the archives:
$ xymoncmd xymon_alert --test SWITCH1 ssh
But I get this:
2014-09-10 11:14:01 Using default environment file /usr/lib/xymon/client/etc/xymonserver.cfg
2014-09-10 11:14:01 execvp() failed: No such file or directory
I assume xymon doesn't stop sending alerts after a timeout period, right? We chose to use the SSH service since it doesn't affect our environment. We can add a fake hostname/ip as well. I fear I may have something misconfigured since alerts won't go through at all. -Tres
list Paul Root
▸
But I get this:
2014-09-10 11:14:01 Using default environment file /usr/lib/xymon/client/etc/xymonserver.cfg
2014-09-10 11:14:01 execvp() failed: No such file or directory
This is a problem. xymoncmd couldn’t find xymon_alert, so it’s not in your path; you need to give the full path.
I assume xymon doesn't stop sending alerts after a timeout period, right? We chose to use the SSH service since it doesn't affect our environment. We can add a fake hostname/ip as well. I fear may have something misconfigured since alerts won't go through at all.
No. Xymon sends alerts according to what you define for it to send. There are no alerts by default.
list Paul Root
Oh, it’s xymond_alert not xymon_alert.
list Carl Inglis
As Paul pointed out, use “xymond_alert” rather than “xymon_alert”.
One of the most useful debug tools I’ve found for alerting is:
Add --cfid to the CMD in the [alerts] section of tasks.cfg
It means that when you *do* get your alerts working, you’ll get the line number of the rule that fired in the subject of the message; this allows you to identify exactly which rule triggered the message.
Good luck. ☺
list Tres Finocchiaro
Oh, it’s xymond_alert not xymon_alert.
Fantastic, that gets me further. So it appears to be breaking on my Perl-compatible percent-asterisk (%*) in alerts.cfg. Apparently my wildcard is invalid. I changed the line from HOST=%* to HOST=* and alerts are working great.
Should I use HOST=*? Is this the recommended method? I would have expected alerts.cfg to include a basic example for "all hosts", but the best I could find was the Perl example.
-Tres - user-88678e65ced1@xymon.invalid
list Jeremy Laidman
On 11 September 2014 01:50, Tres Finocchiaro <user-88678e65ced1@xymon.invalid> wrote:
Fantastic, that gets me further. So it appears to be breaking on my Perl-compatible percent-asterisk (%*) in the alert.cfg. Apparently my wildcard is invalid.
That's because "%*" is invalid. The "%" means what follows is a perl-compatible regular expression (PCRE). An asterisk in a PCRE means "zero or more of the preceding symbol". If there's no preceding symbol, then the PCRE is invalid. What you probably mean is "%.*", where the dot means "any character" and so ".*" means "zero or more of any character".
I change the line from HOST=%* to HOST=* and alerts are working great.
This works because you're no longer using an RE, but are matching due to a special case in the host-matching code, where an asterisk means to match everything. But the effect is the same as if you did "%.*".
▸
Should I use HOST=*, is this the recommended method? I would assume the alert.cfg to have a basic example for "all hosts" but the best I could find was the Perl example.
Given that "HOST=*" is an undocumented feature, and that it looks like a glob pattern but isn't, I would recommend you use "HOST=%.*". J
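The point about "%*" versus "%.*" can be checked mechanically. The Python sketch below scans alerts.cfg-style text for KEY=%pattern tokens and tries to compile each pattern. It is only an approximation: Python's `re` module is not PCRE, although it also rejects a leading `*` because there is nothing for it to repeat. The function name and the address admin@example.com are placeholders, not anything from Xymon.

```python
import re


def invalid_patterns(alerts_cfg_text):
    """Return (line_number, token, error) for each %-pattern that fails to compile."""
    problems = []
    for number, line in enumerate(alerts_cfg_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # ignore whole-line comments
        for token in line.split():
            key, sep, value = token.partition("=")
            # A leading '%' marks the rest of the value as a regular expression.
            if sep and value.startswith("%"):
                try:
                    re.compile(value[1:])
                except re.error as exc:
                    problems.append((number, token, str(exc)))
    return problems


# The rule from this thread, before and after the suggested fix.
broken = "HOST=%* SERVICE=%*\n    MAIL admin@example.com EXSERVICE=telnet\n"
fixed = "HOST=%.* SERVICE=%.*\n    MAIL admin@example.com EXSERVICE=telnet\n"

for number, token, error in invalid_patterns(broken):
    print("line {}: {} -> {}".format(number, token, error))
print("problems after fix:", len(invalid_patterns(fixed)))
```

Running a check like this before reloading the configuration would flag both the HOST and SERVICE tokens of the original rule, while the "%.*" form passes.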
list Tres Finocchiaro
▸
Given that "HOST=*" is an undocumented feature, and that it looks like a glob pattern but isn't, I would recommend you use "HOST=%.*".
Top notch. Thanks.