Xymon Mailing List Archive

GROUPs and recovery alerts

14 messages in this thread

list Heather Keen · Fri, 1 Jul 2011 17:47:19 +0100 ·
Hi,

I want to be able to group servers so that alerts for one bunch of servers
go to one group of people, and alerts for another bunch of servers go to
another group of people.

I used the following config -

analysis.cfg:
HOST=myhost.mydomain.com GROUP=mygroup
     PROC blah
     PROC blahblah
     DISK ....etc

alerts.cfg:
GROUP=mygroup NOTICE RECOVERED COLOR=red
        MAIL user-b75305ea6ec0@xymon.invalid REPEAT=15


I tested this by stopping one of the PROCs listed, and I successfully
received the alert e-mail.  However, when I restarted that process to clear
the alert, the status went green but I did not receive any recovery message.

I also tried just having the GROUP defined against each individual PROC line
(rather than against the HOST), but that didn't result in a recovery message
either.

Am I doing something wrong, or is this a bug?

I can provide the debug output from the alert process if required.

Cheers,
Heather
list Paul Root · Fri, 1 Jul 2011 13:59:45 -0500 ·
I generally put the RECOVERED on the mail line.
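
As a rough sketch of that suggestion, RECOVERED moves from the rule line to
the recipient line. The group name and address are reused from the original
post; this is an illustration only, not a tested configuration.

```
# alerts.cfg (sketch): RECOVERED set on the MAIL recipient instead of the rule
GROUP=mygroup NOTICE COLOR=red
        MAIL user-b75305ea6ec0@xymon.invalid REPEAT=15 RECOVERED
```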


Paul Root    - Engineer III  - Qwest is becoming CenturyLink

This communication is the property of Qwest and may contain confidential or
privileged information. Unauthorized use of this communication is strictly
prohibited and may be unlawful. If you have received this communication
in error, please immediately notify the sender by reply e-mail and destroy
all copies of the communication and any attachments.
list Heather Keen · Sat, 2 Jul 2011 18:50:35 +0100 ·
Yeah, I tried that too. No joy.
list Paul Root · Sat, 2 Jul 2011 18:29:05 -0500 ·
Have you tried using the test program to see how it acts for the failure?

/usr/lib64/xymon/server/bin/hobbitd_alert --test myhost procs --duration=500 |grep -v Failed
list Heather Keen · Mon, 4 Jul 2011 11:13:28 +0100 ·
The test for the alert works fine - but how do you mimic a "recovered"
message with the --test option?  I don't think you can.
list Paul Root · Mon, 4 Jul 2011 10:48:35 -0500 ·
You can't.

I've never used NOTICE. What does it do?

I've never had any luck with RECOVERED on anything except the MAIL line.
list Heather Keen · Mon, 4 Jul 2011 17:19:17 +0100 ·
NOTICE means that you'll get a message when a host or test is enabled or disabled.

I can get RECOVERED to work on HOST or MAIL or SCRIPT lines, just as long as
they aren't in a GROUP.
list Asif Iqbal · Mon, 4 Jul 2011 12:58:24 -0400 ·
On Mon, Jul 4, 2011 at 12:19 PM, Heather Keen <user-4ed3d6202a3b@xymon.invalid> wrote:
> NOTICE means that when you enable or disable an alert you'll get a msg.
> I can get RECOVERED to work on HOST or MAIL or SCRIPT lines, just as long as
> they aren't in a GROUP.

RECOVERED and COLOR=RED on same line?
-- 

Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
list Asif Iqbal · Mon, 4 Jul 2011 12:59:47 -0400 ·
On Mon, Jul 4, 2011 at 12:19 PM, Heather Keen <user-4ed3d6202a3b@xymon.invalid> wrote:
> NOTICE means that when you enable or disable an alert you'll get a msg.
> I can get RECOVERED to work on HOST or MAIL or SCRIPT lines, just as long as
> they aren't in a GROUP.

RECOVERED and COLOR=RED on same line?
list Heather Keen · Tue, 5 Jul 2011 10:58:16 +0100 ·
Aye, there is nothing wrong with that.


Anyway, I think this is a BUG.

Xymon Version 4.3.3.
Configuration as follows:

analysis.cfg:
HOST=myhost.mydomain.com GROUP=heather
        PROC TESTtestTEST 1

(Equally, you could put the GROUP entry on the PROC line; it doesn't matter in
this instance, as the result is the same.)

alerts.cfg:
HOST=*
        MAIL user-f6063d158e7c@xymon.invalid RECOVERED

GROUP=heather
        MAIL user-498ed3e78c69@xymon.invalid RECOVERED


When the alert is generated, both e-mail addresses get the notification.
But when the alert is cleared, only user-f6063d158e7c@xymon.invalid gets the
recovery message.

I've tried lots of different configuration options, and the only conclusion
I can come to is that recovery messages to GROUPs do not work.  :(
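
One possible workaround, assuming the observation above holds (RECOVERED is
honoured on HOST rules but not on GROUP rules), is to route by host and
service rather than by GROUP. The host name and address are taken from the
example above; SERVICE=procs is an assumption that the PROC check reports in
the procs column, as in the --test command earlier in the thread. This is an
untested sketch, and it loses the per-PROC granularity that GROUP gives.

```
# alerts.cfg (workaround sketch, untested): match the host directly
HOST=myhost.mydomain.com SERVICE=procs
        MAIL user-498ed3e78c69@xymon.invalid RECOVERED
```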
list Dominique Frise · Tue, 05 Jul 2011 12:45:55 +0200 ·
I'm afraid you are right!
We have also noticed that RECOVERED messages do not get sent to GROUPs
(using Xymon 4.3.2).

Dominique
quoted from Heather Keen

On 07/ 5/11 11:58 AM, Heather Keen wrote:
Aye, there is nothing wrong with that.


Anyway, I think this is a BUG.

Xymon Version 4.3.3.
Configuration as follows:

analysis.cfg:
HOST=myhost.mydomain.com GROUP=heather
         PROC TESTtestTEST 1

(equally, you could have the GROUP entry on the PROC line, doesn't
matter in this instance as the result is the same)

alerts.cfg:
HOST=*
         MAIL user-f6063d158e7c@xymon.invalid RECOVERED

GROUP=heather
         MAIL user-498ed3e78c69@xymon.invalid RECOVERED


When the alert is generated, both e-mail addresses get the notification.
But when the alert is cleared, only user-f6063d158e7c@xymon.invalid
gets the recovery message.

I've tried lots of different configuration options, and the only
conclusion I can come to is that recovery messages to GROUPs do not
work.  :(


On 4 July 2011 17:58, Asif Iqbal <user-6f4b51ac2a40@xymon.invalid> wrote:

    On Mon, Jul 4, 2011 at 12:19 PM, Heather Keen <user-4ed3d6202a3b@xymon.invalid> wrote:
NOTICE means that when you enable or disable an alert you'll get a msg.
I can get RECOVERED to work on HOST or MAIL or SCRIPT lines, just as long as
they aren't in a GROUP.
    RECOVERED and COLOR=RED on same line?

On 4 July 2011 16:48, Root, Paul <user-c80045f511e8@xymon.invalid> wrote:
You can’t.

I’ve never used notice. What does it do?

I’ve never had any luck with recovered on anything except the Mail line.

Paul Root    - Engineer III  - Qwest is becoming CenturyLink


From: Heather Keen [mailto:user-4ed3d6202a3b@xymon.invalid]
Sent: Monday, July 04, 2011 5:13 AM
To: Root, Paul
Cc: xymon at xymon.com
Subject: Re: [Xymon] GROUPs and recovery alerts

The test for the alert works fine - but how do you mimic a "recovered"
message with the --test option?  I don't think you can.


On 3 July 2011 00:29, Root, Paul <user-c80045f511e8@xymon.invalid> wrote:
Have you tried using the test program to see how it acts for the failure?

/usr/lib64/xymon/server/bin/hobbitd_alert --test myhost procs --duration=500 |grep -v Failed

Paul Root    - Engineer III  - Qwest is becoming CenturyLink


From: Heather Keen [mailto:user-4ed3d6202a3b@xymon.invalid]
Sent: Saturday, July 02, 2011 12:51 PM
To: Root, Paul
Cc: xymon at xymon.com
Subject: Re: [Xymon] GROUPs and recovery alerts

Yeah, I tried that too. No joy.


list Henrik Størner · Mon, 01 Aug 2011 17:14:00 +0200 ·
quoted from Heather Keen
On 05-07-2011 11:58, Heather Keen wrote:
Anyway, I think this is a BUG.

Xymon Version 4.3.3.
Configuration as follows:

analysis.cfg:
HOST=myhost.mydomain.com GROUP=heather
         PROC TESTtestTEST 1

alerts.cfg:
HOST=*
         MAIL user-f6063d158e7c@xymon.invalid RECOVERED

GROUP=heather
         MAIL user-498ed3e78c69@xymon.invalid RECOVERED


When the alert is generated, both e-mail addresses get the notification.
But when the alert is cleared, only user-f6063d158e7c@xymon.invalid
gets the recovery message.

I've tried lots of different configuration options, and the only
conclusion I can come to is that recovery messages to GROUPs do not
work.  :(
It's certainly not what you would expect - must agree with that. But solving it is not quite as easy as one would expect.

The problem is that when the PROC triggers a red status, Xymon knows that the rule was one that included a "GROUP=heather" setting. But when the recovery happens, it is because none of the rules in analysis.cfg triggered. So Xymon does not know that the green status is a recovery from a rule that contained the GROUP setting.

There is some state lost here.

To solve this, the xymond_alert module will have to keep track of the active alerts, and which GROUP settings triggered them. When the recovery happens, it will then use that list of groups that received the alert as the basis for sending out the recovered-notices.
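
The bookkeeping described in the paragraph above could look roughly like the following Python sketch. It is only an illustration of the idea, not xymond_alert code: the class name `ActiveAlertTracker` and its methods are hypothetical, and the state is assumed to live in memory.

```python
from collections import defaultdict


class ActiveAlertTracker:
    """Hypothetical record of which GROUP settings triggered each active alert."""

    def __init__(self):
        # (host, service) -> set of GROUP names seen while the alert was active
        self._groups = defaultdict(set)

    def record_alert(self, host, service, group):
        """Remember that an alert for host/service matched a rule with this GROUP."""
        self._groups[(host, service)].add(group)

    def recover(self, host, service):
        """Return the GROUPs to notify on recovery, then forget the alert."""
        return self._groups.pop((host, service), set())


tracker = ActiveAlertTracker()
tracker.record_alert("myhost.mydomain.com", "procs", "heather")

# When the status goes green, the stored groups drive the recovery notices.
recovered_groups = tracker.recover("myhost.mydomain.com", "procs")
```

The point of the sketch is that the GROUP name is stored at alert time, so the recovery path does not need analysis.cfg to tell it which rule originally fired.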

It can be solved, of course. Just don't be disappointed when you see 4.3.4 being released later today without a fix for this problem.


Regards,
Henrik
list Henrik Størner · Mon, 01 Aug 2011 17:37:03 +0200 ·
quoted from Henrik Størner
On 01-08-2011 17:14, Henrik Størner wrote:
On 05-07-2011 11:58, Heather Keen wrote:
I've tried lots of different configuration options, and the only
conclusion I can come to is that recovery messages to GROUPs do not
work. :(
It's certainly not what you would expect - must agree with that. But
solving it is not quite as easy as one would expect.
After looking at this once again, I actually think there is a very simple solution to this after all. If we don't check the GROUP rules at all for recovery-messages (i.e. any group setting will match), then xymond_alert will consider all the possible recipients. However, there is another check so it only sends recovery-messages to those recipients that actually did receive the alert. So I think the attached patch should solve this.
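
As a rough model of the approach described above (not the attached patch, and not the real xymond_alert logic), the two checks can be sketched in Python. The `Rule` type, the `recipients_for` function and the example addresses are all hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    group: Optional[str]  # None models a rule without a GROUP= restriction
    recipient: str


def recipients_for(rules, alert_group, is_recovery, already_alerted):
    """Pick recipients for one notification.

    alert_group: GROUP that triggered the alert (unknown, so None, on recovery)
    already_alerted: recipients that received the original alert
    """
    selected = []
    for rule in rules:
        # On recovery the GROUP test is skipped, so every rule is a candidate.
        group_ok = is_recovery or rule.group is None or rule.group == alert_group
        if not group_ok:
            continue
        # Second check: a recovery only goes to those who actually got the alert.
        if is_recovery and rule.recipient not in already_alerted:
            continue
        selected.append(rule.recipient)
    return selected


rules = [
    Rule(group=None, recipient="all@example.invalid"),
    Rule(group="heather", recipient="heather@example.invalid"),
    Rule(group="other", recipient="other@example.invalid"),
]

alerted = recipients_for(rules, "heather", False, set())
recovered = recipients_for(rules, None, True, set(alerted))
```

In this model the recipient of the unrelated group is considered on recovery but filtered out by the second check, because it never received the alert.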

Regards,
Henrik
list Heather Keen · Thu, 19 Jan 2012 16:47:11 +0000 ·
quoted from Henrik Størner
On 1 August 2011 16:37, Henrik Størner <user-ce4a2c883f75@xymon.invalid> wrote:
On 01-08-2011 17:14, Henrik Størner wrote:
On 05-07-2011 11:58, Heather Keen wrote:
I've tried lots of different configuration options, and the only
conclusion I can come to is that recovery messages to GROUPs do not
work. :(
It's certainly not what you would expect - must agree with that. But
solving it is not quite as easy as one would expect.
After looking at this once again, I actually think there is a very simple
solution to this after all. If we don't check the GROUP rules at all for
recovery-messages (i.e. any group setting will match), then xymond_alert
will consider all the possible recipients. However, there is another check
so it only sends recovery-messages to those recipients that actually did
receive the alert. So I think the attached patch should solve this.

Regards,
Henrik
Henrik,

I've been doing a bit more testing with alerts using GROUPS, and I've
discovered a slight flaw with this solution, when you are using SCRIPT as
the recipient rather than MAIL.  Because it doesn't check the GROUP when it
sends a RECOVERED message, you can end up getting multiple RECOVERED
messages sent to the same person.  (tested with v4.3.7)

For example:

GROUP=A SERVICE=procs RECOVERED COLOR=red
     SCRIPT /home/xymon/server/ext/sms_notification 447777123456 FORMAT=SMS DURATION>5
GROUP=B SERVICE=procs RECOVERED COLOR=red
     SCRIPT /home/xymon/server/ext/sms_notification 447777123456 FORMAT=SMS DURATION>10

So you've got two groups of machines, each having the same recipient,
but needing a different alert delay.

Now, if procs goes red on a machine in group A, the red alert is handled
fine, but when it recovers, 447777123456 actually gets two recovery
messages.

Note this only happens if the recipient is a SCRIPT command, it works fine
if you use MAIL recipients.

Help!

Cheers,
Heather