Double recovery messages
list Kevin Hanrahan
Hi,
I am getting double recovery messages. When a test fails, I get a
single message, as I should, but when the service recovers, I get two
recovery messages. All my alert rules are set up like this:
$SECURITY=%server1|server2|server3
$SYSADMIN=user-490a14721d84@xymon.invalid
HOST=$SECURITY
MAIL $SYSADMIN COLOR=red EXSERVICE=msgs,cpu REPEAT=30m RECOVERED
MAIL $SYSADMIN COLOR=red SERVICE=cpu DURATION>20 REPEAT=30m RECOVERED
MAIL $SYSADMIN COLOR=purple REPEAT=1h RECOVERED
Can anybody see an error I made in syntax? Does anybody else get this
symptom?
Thanks
Kevin
Note: The information contained in this email and in any attachments is
intended only for the person or entity to which it is addressed and may
contain confidential and/or privileged material. Any review,
retransmission, dissemination or other use of, or taking of any action in
reliance upon, this information by persons or entities other than the
intended recipient is prohibited. The recipient should check this email and
any attachments for the presence of viruses. Sender accepts no liability
for any damages caused by any virus transmitted by this email. If you have
received this email in error, please notify us immediately by replying to
the message and delete the email from your computer. This e-mail is and any
response to it will be unencrypted and, therefore, potentially unsecure.
Thank you. NOVA Information Systems, Inc.
list Henrik Størner
On Mon, Mar 07, 2005 at 01:26:28PM -0500, user-fd47fec4b039@xymon.invalid wrote:
I am getting double recovery messages. When I get a failed test, I get a single message...as I should but when the service recovers, I get double recovery messages.
Could you try adding the --cfid option to the hobbitd_alert command in hobbitlaunch.cfg? I'd like to know exactly which lines in the setup are triggering these alerts; the "[cfid:LINENUMBER]" in the messages will tell me.
I have a hunch that the last line (for the purple tests) is causing the duplicate, but I'd like to know for certain. I'll try replicating this tomorrow; it's too late now for serious debugging.
Henrik
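Once --cfid is enabled, the rule line numbers can be pulled out of the collected alert mails to see which events were sent by more than one rule. The following is only a minimal sketch: the "[cfid:LINENUMBER]" marker comes from the message above, but the (host, service, body) tuple layout, the function names, and the sample message bodies are hypothetical and not Hobbit's real mail format.

```python
import re
from collections import defaultdict

# Marker format assumed from the "[cfid:LINENUMBER]" wording in the thread.
CFID_RE = re.compile(r"\[cfid:(\d+)\]")


def cfids_by_event(messages):
    """Group rule line numbers by (host, service).

    `messages` is an iterable of (host, service, body) tuples. That layout
    is hypothetical; adapt it to however the mails are actually collected.
    """
    seen = defaultdict(list)
    for host, service, body in messages:
        for match in CFID_RE.finditer(body):
            seen[(host, service)].append(int(match.group(1)))
    return dict(seen)


def duplicates(grouped):
    """Keep only the events that were sent by more than one rule line."""
    return {
        event: sorted(set(lines))
        for event, lines in grouped.items()
        if len(set(lines)) > 1
    }


# Made-up sample bodies, for illustration only.
sample = [
    ("server1", "conn", "server1.conn recovered [cfid:210]"),
    ("server1", "conn", "server1.conn recovered [cfid:212]"),
    ("server2", "cpu", "server2.cpu recovered [cfid:211]"),
]

if __name__ == "__main__":
    print(duplicates(cfids_by_event(sample)))
```

With the sample data, only the server1/conn event is reported, because it carries two different rule line numbers.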
list Kevin Hanrahan
Henrik,
You were correct. I got cfid:210 and cfid:212 for one of the double
recoveries which correspond to:
HOST=$AV
MAIL $SYSADMIN COLOR=red EXSERVICE=msgs,cpu REPEAT=30m RECOVERED (210)
MAIL $SYSADMIN COLOR=red SERVICE=cpu DURATION>20 REPEAT=30m RECOVERED (211)
MAIL $SYSADMIN COLOR=purple REPEAT=1h RECOVERED (212)
What is wrong with line 212? Could it be that, because I did not specify
any service, it matches all services? Is there a better way to do this?
I want to put the purple alerts on a longer repeat timer if possible.
Please advise.
Kevin
list Henrik Størner
On Mon, Mar 07, 2005 at 10:18:08PM -0500, user-fd47fec4b039@xymon.invalid wrote:
You were correct. I got cfid:210 and cfid:212 for one of the double
recoveries which correspond to:
HOST=$AV
MAIL $SYSADMIN COLOR=red EXSERVICE=msgs,cpu REPEAT=30m RECOVERED (210)
MAIL $SYSADMIN COLOR=red SERVICE=cpu DURATION>20 REPEAT=30m RECOVERED (211)
MAIL $SYSADMIN COLOR=purple REPEAT=1h RECOVERED (212)
What is wrong with line 212?
Nothing wrong with your configuration. I thought about this last night, and I'm pretty sure it is a bug in how hobbit decides which recovery messages to send out. I'll look at this later today.
Henrik
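One possible reading of the cfid:210 and cfid:212 pair can be sketched with a toy model. This is not Hobbit's code: it models only the SERVICE and EXSERVICE filters from the three rules and deliberately ignores COLOR, on the unconfirmed assumption that color is not compared when a recovery is processed. The `Rule` class and the service name "conn" are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Toy stand-in for one MAIL rule; COLOR is intentionally left out."""
    line: int
    service: frozenset = frozenset()    # SERVICE=...   (empty: no filter)
    exservice: frozenset = frozenset()  # EXSERVICE=... (empty: no filter)

    def selects(self, service: str) -> bool:
        # An excluded service never matches.
        if service in self.exservice:
            return False
        # No SERVICE filter means every remaining service matches.
        return not self.service or service in self.service


# The three rules from the thread, keyed by their cfid line numbers.
RULES = [
    Rule(210, exservice=frozenset({"msgs", "cpu"})),
    Rule(211, service=frozenset({"cpu"})),
    Rule(212),
]


def matching_lines(service: str) -> list:
    """Return the rule lines whose service filters select `service`."""
    return [rule.line for rule in RULES if rule.selects(service)]


if __name__ == "__main__":
    for name in ("conn", "cpu", "msgs"):
        print(name, matching_lines(name))
```

Under that assumption, a service other than msgs or cpu is selected by both line 210 and line 212, which matches the pair of cfid values reported above. Whether this is the actual cause is not established in the thread.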