GROUPs and recovery alerts
list Heather Keen
Hi,
I want to be able to group servers so that alerts for one bunch of servers
go to one group of people, and another group of servers to another group of
people.
I used the following config -
analysis.cfg:
HOST=myhost.mydomain.com GROUP=mygroup
PROC blah
PROC blahblah
DISK ....etc
alerts.cfg:
GROUP=mygroup NOTICE RECOVERED COLOR=red
MAIL user-b75305ea6ec0@xymon.invalid REPEAT=15
Now, I tested this by stopping one of the PROCs listed, and I successfully
received the alert e-mail. However, when I restarted that process to clear
the alert, the status went green but I did not receive any recovery message.
I also tried just having the GROUP defined against each individual PROC line
(rather than against the HOST), but that didn't result in a recovery message
either.
Am I doing something wrong, or is this a bug?
I can provide the debug output from the alert process if required.
Cheers,
Heather
list Paul Root
I generally put the RECOVERED on the mail line.
Paul Root - Engineer III - Qwest is becoming CenturyLink
This communication is the property of Qwest and may contain confidential or
privileged information. Unauthorized use of this communication is strictly
prohibited and may be unlawful. If you have received this communication
in error, please immediately notify the sender by reply e-mail and destroy
all copies of the communication and any attachments.
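For reference, the placement Paul describes would look roughly like the fragment below. It is only a sketch that reuses the group name and address from the original post, with RECOVERED moved from the rule line to the recipient line and everything else left alone; the indentation is cosmetic. Heather's next reply says this arrangement did not produce a recovery message either.

```
# alerts.cfg (sketch: RECOVERED moved from the GROUP rule to the MAIL recipient)
GROUP=mygroup NOTICE COLOR=red
    MAIL user-b75305ea6ec0@xymon.invalid RECOVERED REPEAT=15
```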
list Heather Keen
Yeah, I tried that too. No joy.
list Paul Root
Have you tried using the test program to see how it acts for the failure?
/usr/lib64/xymon/server/bin/hobbitd_alert --test myhost procs --duration=500 | grep -v Failed
list Heather Keen
The test for the alert works fine - but how do you mimic a "recovered" message with the --test option? I don't think you can.
list Paul Root
You can't. I've never used notice. What does it do? I've never had any luck with recovered on anything except the Mail line.
list Heather Keen
NOTICE means that when you enable or disable an alert you'll get a msg. I can get RECOVERED to work on HOST or MAIL or SCRIPT lines, just as long as they aren't in a GROUP.
list Asif Iqbal
On Mon, Jul 4, 2011 at 12:19 PM, Heather Keen <user-4ed3d6202a3b@xymon.invalid> wrote:
> NOTICE means that when you enable or disable an alert you'll get a msg. I can get RECOVERED to work on HOST or MAIL or SCRIPT lines, just as long as they aren't in a GROUP.
RECOVERED and COLOR=RED on same line?
--
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
list Heather Keen
Aye, there is nothing wrong with that.
Anyway, I think this is a BUG.
Xymon Version 4.3.3.
Configuration as follows:
analysis.cfg:
HOST=myhost.mydomain.com GROUP=heather
PROC TESTtestTEST 1
(equally, you could have the GROUP entry on the PROC line, doesn't matter in
this instance as the result is the same)
alerts.cfg:
HOST=*
MAIL user-f6063d158e7c@xymon.invalid RECOVERED
GROUP=heather
MAIL user-498ed3e78c69@xymon.invalid RECOVERED
When the alert is generated, both e-mail addresses get the notification.
But when the alert is cleared, only user-f6063d158e7c@xymon.invalid gets the recovery
message.
I've tried lots of different configuration options, and the only conclusion
I can come to is that recovery messages to GROUPs do not work. :(
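Going by the observation above that the recipient under HOST=* does receive the recovery message, one possible workaround is to match the affected hosts directly instead of through the GROUP. The fragment below is an untested sketch under that assumption: the host name and address are taken from this thread, and the SERVICE=procs filter is an added assumption to keep the rule limited to the process test.

```
# alerts.cfg (untested workaround sketch: match the host directly, not via GROUP)
# SERVICE=procs is an assumption, added to limit the rule to the process check
HOST=myhost.mydomain.com SERVICE=procs
    MAIL user-498ed3e78c69@xymon.invalid RECOVERED
```

The trade-off is that the host list has to be maintained in alerts.cfg by hand, and the per-PROC targeting that a GROUP on an individual PROC line allows is lost.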
list Dominique Frise
I'm afraid you are right! We have also noticed that RECOVERED msgs do not get sent to GROUPs (using Xymon-4.3.2).
Dominique
▸
On 07/ 5/11 11:58 AM, Heather Keen wrote:Aye, there is nothing wrong with that. Anyway, I think this is a BUG. Xymon Version 4.3.3. Configuration as follows: analysis.cfg:
HOST=myhost.mydomain.com <http://myhost.mydomain.com>; GROUP=heather
▸
PROC TESTtestTEST 1
(equally, you could have the GROUP entry on the PROC line, doesn't
matter in this instance as the result is the same)
alerts.cfg:
HOST=*
MAIL user-f6063d158e7c@xymon.invalid <mailto:user-f6063d158e7c@xymon.invalid> RECOVERED
GROUP=heather
MAIL user-498ed3e78c69@xymon.invalid <mailto:user-498ed3e78c69@xymon.invalid> RECOVERED
▸
When the alert is generated, both e-mail addresses get the notification.
But when the alert is cleared, only user-f6063d158e7c@xymon.invalid
<mailto:user-f6063d158e7c@xymon.invalid> gets the recovery message.
▸
I've tried lots of different configuration options, and the only conclusion I can come to is that recovery messages to GROUPs do not work. :(
list Henrik Størner
On 05-07-2011 11:58, Heather Keen wrote:
Anyway, I think this is a BUG.
Xymon Version 4.3.3.
Configuration as follows:
analysis.cfg:
HOST=myhost.mydomain.com GROUP=heather
PROC TESTtestTEST 1
alerts.cfg:
HOST=*
MAIL user-f6063d158e7c@xymon.invalid RECOVERED
GROUP=heather
MAIL user-498ed3e78c69@xymon.invalid RECOVERED
When the alert is generated, both e-mail addresses get the notification.
But when the alert is cleared, only user-f6063d158e7c@xymon.invalid
gets the recovery message.
I've tried lots of different configuration options, and the only
conclusion I can come to is that recovery messages to GROUPs do not
work. :(
It's certainly not what you would expect - must agree with that. But solving it is not quite as easy as one would expect.
The problem is that when the PROC triggers a red status, Xymon knows that the rule was one that included a "GROUP=heather" setting. But when the recovery happens, it is because none of the rules in analysis.cfg triggered. So Xymon does not know that the green status is a recovery from a rule that contained the GROUP setting. There is some state lost here.
To solve this, the xymond_alert module will have to keep track of the active alerts, and which GROUP settings triggered them. When the recovery happens, it will then use that list of groups that received the alert as the basis for sending out the recovered-notices.
It can be solved, of course. Just don't be disappointed when you see 4.3.4 being released later today without a fix for this problem.
Regards,
Henrik
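The state tracking described above can be pictured with a small sketch. It is only an illustration in Python, not code from xymond_alert (which is written in C); the class name `AlertTracker` and its methods are hypothetical. The idea is to remember, per host and service, which GROUP settings triggered the alert, and to reuse that list when the status goes green.

```python
from collections import defaultdict


class AlertTracker:
    """Hypothetical helper: remembers which GROUPs triggered each active alert."""

    def __init__(self):
        # (host, service) -> set of group names seen when the alert fired
        self._active = defaultdict(set)

    def alert(self, host, service, groups):
        """Record the groups whose rules triggered the red status."""
        self._active[(host, service)].update(groups)

    def recover(self, host, service):
        """Return the groups that should get the recovered-notice, then forget the alert."""
        return sorted(self._active.pop((host, service), set()))


tracker = AlertTracker()
tracker.alert("myhost.mydomain.com", "procs", {"heather"})
print(tracker.recover("myhost.mydomain.com", "procs"))  # groups stored at alert time
print(tracker.recover("myhost.mydomain.com", "procs"))  # nothing left to notify
```

The point of the sketch is that the green status itself carries no GROUP information, so the list has to be stored when the alert is sent.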
list Henrik Størner
On 01-08-2011 17:14, Henrik Størner wrote:
On 05-07-2011 11:58, Heather Keen wrote:
I've tried lots of different configuration options, and the only conclusion I can come to is that recovery messages to GROUPs do not work. :(
It's certainly not what you would expect - must agree with that. But solving it is not quite as easy as one would expect.
After looking at this once again, I actually think there is a very simple solution to this after all.
If we don't check the GROUP rules at all for recovery-messages (i.e. any group setting will match), then xymond_alert will consider all the possible recipients. However, there is another check so it only sends recovery-messages to those recipients that actually did receive the alert.
So I think the attached patch should solve this.
Regards,
Henrik
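A rough sketch of that idea, in Python rather than the C of xymond_alert, and not taken from the attached patch: the `Rule` type, the function `recipients_for` and the example addresses are all made up for illustration. On recovery the GROUP condition is treated as matching, and the separate "did this recipient get the alert" check does the filtering.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    recipient: str
    group: Optional[str] = None  # None means the rule has no GROUP condition


def recipients_for(rules, alert_groups, recovering, already_alerted):
    """Pick recipients; on recovery, skip the GROUP check but require a prior alert."""
    selected = []
    for rule in rules:
        group_ok = recovering or rule.group is None or rule.group in alert_groups
        if not group_ok:
            continue
        if recovering and rule.recipient not in already_alerted:
            continue
        selected.append(rule.recipient)
    return selected


rules = [
    Rule("ops@example.invalid"),
    Rule("heather@example.invalid", group="heather"),
    Rule("other@example.invalid", group="other"),
]
alerted = recipients_for(rules, {"heather"}, recovering=False, already_alerted=set())
recovered = recipients_for(rules, set(), recovering=True, already_alerted=set(alerted))
print(alerted)
print(recovered)
```

In this model the recipient of the unrelated group never received the alert, so it is filtered out of the recovery list even though its GROUP condition was not checked.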
list Heather Keen
On 1 August 2011 16:37, Henrik Størner <user-ce4a2c883f75@xymon.invalid> wrote:
On 01-08-2011 17:14, Henrik Størner wrote:
On 05-07-2011 11:58, Heather Keen wrote:
I've tried lots of different configuration options, and the only conclusion I can come to is that recovery messages to GROUPs do not work. :(
It's certainly not what you would expect - must agree with that. But solving it is not quite as easy as one would expect.
After looking at this once again, I actually think there is a very simple solution to this after all.
If we don't check the GROUP rules at all for recovery-messages (i.e. any group setting will match), then xymond_alert will consider all the possible recipients. However, there is another check so it only sends recovery-messages to those recipients that actually did receive the alert.
So I think the attached patch should solve this.
Regards,
Henrik
Henrik,
I've been doing a bit more testing with alerts using GROUPS, and I've
discovered a slight flaw with this solution, when you are using SCRIPT as
the recipient rather than MAIL. Because it doesn't check the GROUP when it
sends a RECOVERED message, you can end up getting multiple RECOVERED
messages sent to the same person. (tested with v4.3.7)
For example:
GROUP=A SERVICE=procs RECOVERED COLOR=red
SCRIPT /home/xymon/server/ext/sms_notification 447777123456 FORMAT=SMS DURATION>5
GROUP=B SERVICE=procs RECOVERED COLOR=red
SCRIPT /home/xymon/server/ext/sms_notification 447777123456 FORMAT=SMS DURATION>10
So you've got two groups of machines, each having the same recipient,
but needing a different alert delay.
Now, if procs goes red on a machine in group A, the red alert is handled
fine, but when it recovers, 447777123456 actually gets two recovery
messages.
Note this only happens if the recipient is a SCRIPT command, it works fine
if you use MAIL recipients.
Help!
Cheers,
Heather
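One way to picture the reported symptom is the toy model below. It is a guess written in Python, not an excerpt from xymond_alert, and it does not explain why MAIL recipients behave differently. It assumes that on recovery the GROUP condition is ignored and that the only remaining check is whether the recipient string received the alert; the function name `recovery_recipients` is hypothetical.

```python
def recovery_recipients(rules, already_alerted):
    """GROUP is ignored on recovery; only the 'recipient got the alert' check applies."""
    return [recipient for _group, recipient in rules if recipient in already_alerted]


# Two rules for different groups that share one SMS recipient, as in the example above.
rules = [("A", "447777123456"), ("B", "447777123456")]
already_alerted = {"447777123456"}  # the alert was sent through the GROUP=A rule

dupes = recovery_recipients(rules, already_alerted)
unique = list(dict.fromkeys(dupes))  # order-preserving de-duplication

print(dupes)   # the same number appears once per matching rule
print(unique)  # one entry after de-duplication
```

Under these assumptions both rules pass the check because the recipient is the same, so a fix would need to de-duplicate recipients or track which rule sent the original alert.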