Xymon Mailing List Archive

Alert transition yellow -> red with repeat problem

8 messages in this thread

list John Rothlisberger · Wed, 29 Mar 2017 12:26:21 +0000 ·
This is a problem I have seen for a long time, and I have brought it up on the list before.
Xymon 4.3.21
Ubuntu 14.04LTS

The problem (in this instance): a warning that is set to repeat daily suddenly goes red and triggers a single alert, but the alert doesn't repeat until the warning's repeat time has passed.

From the notification.log:
Tue Mar 28 05:31:28 2017 ServerA.disk (IP) disk_warn 1490697082 100 <- warning sets a repeat time of 1 day
Tue Mar 28 05:53:32 2017 ServerA.disk (IP) disk_alert 1490698405 100 <- minutes later it goes red (red repeat time is 15 minutes but no further alerts are generated)
Next alert comes out 1 day after the above warning:
Wed Mar 29 05:31:31 2017 ServerA.disk (IP) disk_alert 1490783486 100 <- 1 day after previous warning.  This should have been repeated every 15 minutes.
Wed Mar 29 05:46:40 2017 ServerA.disk (IP) disk_alert 1490784394 100 <- now, the repeat time is 15 minutes
... <- and continues every 15 minutes.

This alert went a full 24 hours with only a single notification.  :(
I have seen this before (though not always): the repeat time of a warning overrides a follow-up alert until the warning's repeat time has expired.
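
The gaps between the four log lines can be checked with a short script. This is only a sketch: it parses the leading timestamp of each line and takes the recipient from the eighth whitespace-separated field, which is an assumption based on the sample lines rather than on a documented notification.log format. The annotations after "<-" are left out of the sample data.

```python
from datetime import datetime

# The four lines quoted from notification.log, without the "<-" annotations.
LOG = """\
Tue Mar 28 05:31:28 2017 ServerA.disk (IP) disk_warn 1490697082 100
Tue Mar 28 05:53:32 2017 ServerA.disk (IP) disk_alert 1490698405 100
Wed Mar 29 05:31:31 2017 ServerA.disk (IP) disk_alert 1490783486 100
Wed Mar 29 05:46:40 2017 ServerA.disk (IP) disk_alert 1490784394 100
"""

def parse(line):
    """Return (timestamp, recipient) for one log line (field layout assumed)."""
    fields = line.split()
    stamp = datetime.strptime(" ".join(fields[:5]), "%a %b %d %H:%M:%S %Y")
    return stamp, fields[7]

entries = [parse(line) for line in LOG.splitlines()]

# Seconds between each line and the one before it.
gaps = [int((b[0] - a[0]).total_seconds()) for a, b in zip(entries, entries[1:])]

for (stamp, recipient), gap in zip(entries[1:], gaps):
    print(f"{stamp}  {recipient:<10}  {gap:>6} s since previous line")

# Seconds from the original warning to the second red alert.
warn_to_second_red = int((entries[2][0] - entries[0][0]).total_seconds())
print("warning -> second red alert:", warn_to_second_red, "s")
```

The second red alert lands almost exactly one day (86400 s) after the yellow warning, not 15 minutes after the first red alert.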

Alert rules:
   SCRIPT /home/xymon/server/ext/pg/exwarn_SCRIPT disk_warn DURATION>30 REPEAT=1d COLOR=yellow SERVICE=disk FORMAT=TEXT UNMATCHED
   SCRIPT /home/xymon/server/ext/pg/exalert_SCRIPT disk_alert DURATION>20 REPEAT=15 COLOR=red SERVICE=disk FORMAT=TEXT UNMATCHED
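
For a sense of scale, here is a rough count of the red repeats that were due in the gap. It assumes REPEAT=15 means 15 minutes (as the log annotations above state) and that repeats fall on exact 15-minute marks, which real notifications only approximate. The two timestamps are copied from the notification.log excerpt; the helper name `due_repeats` is made up for this illustration.

```python
from datetime import datetime, timedelta

first_red = datetime(2017, 3, 28, 5, 53, 32)   # first disk_alert in the log
next_red = datetime(2017, 3, 29, 5, 31, 31)    # next disk_alert actually sent
repeat = timedelta(minutes=15)                 # REPEAT=15 on the red rule

def due_repeats(start, end, interval):
    """Return the times strictly between start and end at which a repeat was due."""
    due = []
    t = start + interval
    while t < end:
        due.append(t)
        t += interval
    return due

missed = due_repeats(first_red, next_red, repeat)
print(len(missed), "repeats were due; first at", missed[0], "last at", missed[-1])
```

Under those assumptions, 94 repeat notifications for the red state were never sent.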

Ideas/thoughts?
Thanks,
John
Upcoming PTO:  4/3
John Rothlisberger
IT Strategy, Infrastructure & Security - Technology Growth Platform
TGP for Business Process Outsourcing
Accenture
XXX.XXX.XXXX office


This message is for the designated recipient only and may contain privileged, proprietary, or otherwise confidential information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the e-mail by you is prohibited. Where allowed by local law, electronic communications with Accenture and its affiliates, including e-mail and instant messaging (including content), may be scanned by our systems for the purposes of information security and assessment of internal compliance with Accenture policy.

www.accenture.com
list John Rothlisberger · Tue, 11 Apr 2017 12:49:04 +0000 ·
This is a problem I have seen for a long time, and I have brought it up on the list before.
Xymon 4.3.21
Ubuntu 14.04LTS

The problem (in this instance): a warning that is set to repeat daily suddenly goes red and triggers a single alert, but the alert doesn't repeat until the warning's repeat time has passed.

From the notification.log:
Tue Mar 28 05:31:28 2017 ServerA.disk (IP) disk_warn 1490697082 100 <- warning sets a repeat time of 1 day
Tue Mar 28 05:53:32 2017 ServerA.disk (IP) disk_alert 1490698405 100 <- minutes later it goes red (red repeat time is 15 minutes but no further alerts are generated)
Next alert comes out 1 day after the above warning:
Wed Mar 29 05:31:31 2017 ServerA.disk (IP) disk_alert 1490783486 100 <- 1 day after previous warning.  This should have been repeated every 15 minutes.
Wed Mar 29 05:46:40 2017 ServerA.disk (IP) disk_alert 1490784394 100 <- now, the repeat time is 15 minutes
... <- and continues every 15 minutes.

This alert went a full 24 hours with only a single notification.  :(
I have seen this before (though not always): the repeat time of a warning overrides a follow-up alert until the warning's repeat time has expired.

Alert rules:
   SCRIPT /home/xymon/server/ext/pg/exwarn_SCRIPT disk_warn DURATION>30 REPEAT=1d COLOR=yellow SERVICE=disk FORMAT=TEXT UNMATCHED
   SCRIPT /home/xymon/server/ext/pg/exalert_SCRIPT disk_alert DURATION>20 REPEAT=15 COLOR=red SERVICE=disk FORMAT=TEXT UNMATCHED

Ideas/thoughts?


Thanks,
John
list John Rothlisberger · Fri, 28 Apr 2017 18:40:07 +0000 ·
This problem contributed to an outage last night - something is wrong.

Last night we had a disk that was in the warning state - warning email sent.

That disk then went into an alert state, and an alert email was triggered right away (the DURATION value took into account the time it was yellow) and then again 15 minutes later, as designed.

THEN another disk on that server went yellow.  It did NOT trigger any emails (as expected, since we should continue to focus on the alerts), but it somehow interfered with the REPEAT time of the alerts, and those STOPPED.

There is a bug somewhere.

Thanks,
John
list John Rothlisberger · Thu, 1 Jun 2017 18:03:29 +0000 ·
This problem continues and I can't seem to get anybody's attention.

Has anyone seen this problem before?

Thanks,
John
list Josh Luthman · Thu, 1 Jun 2017 14:21:29 -0400 ·
It's a known issue.  I don't believe I've ever seen any kind of resolution.


Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX
list John Rothlisberger · Tue, 21 Nov 2017 13:17:39 +0000 ·
Is anyone looking to fix this issue?

It keeps popping up and folks are losing confidence in Xymon.  It’s hard not to when something goes from yellow->red and there are no alerts generated.

To recap:

  *   Test goes yellow
  *   Duration passes
  *   Warning sent
  *   Repeat time set to X (in my case 8 hours)
  *   Test goes red (within that 8 hour repeat time)
  *   There are no alerts generated even though the test has gone red
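
The recap above can be turned into a toy model that contrasts the expected behaviour with the reported one. This is not Xymon's code and makes no claim about where the actual defect lies: the `Rule` class, the timeline in `color_at`, and the "shared timer" variant are all hypothetical, chosen only because a single repeat timer per test would reproduce the symptom. DURATION is counted from the moment the test first left green, matching the earlier observation that the red alert's DURATION included the time spent yellow.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    recipient: str
    color: str
    duration_min: int  # notify only after the problem has lasted longer than this
    repeat_min: int    # minimum minutes between notifications

# Red rule listed before yellow; the yellow repeat is 8 hours as in the recap.
RULES = [
    Rule("disk_alert", "red", 20, 15),
    Rule("disk_warn", "yellow", 30, 8 * 60),
]

def color_at(minute):
    """Hypothetical timeline: yellow from minute 0, red from minute 60."""
    return "yellow" if minute < 60 else "red"

def simulate(shared_timer, end_min=120):
    """Return [(minute, recipient)] for every notification sent before end_min."""
    next_allowed = {}
    sent = []
    for minute in range(end_min):
        for rule in RULES:
            # Either one repeat timer for the whole test, or one per rule.
            key = "test" if shared_timer else rule.recipient
            if (rule.color == color_at(minute)
                    and minute > rule.duration_min
                    and minute >= next_allowed.get(key, 0)):
                sent.append((minute, rule.recipient))
                next_allowed[key] = minute + rule.repeat_min
    return sent

print("per-rule timer:", simulate(shared_timer=False))
print("shared timer:  ", simulate(shared_timer=True))
```

With a per-rule timer the red notification goes out as soon as the test turns red and then every 15 minutes. With the shared timer, the 8-hour repeat set by the yellow warning suppresses every red notification in the window, which is the behaviour described in the recap.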

Thanks,
John
list Alan Ford · Tue, 21 Nov 2017 21:19:28 +0000 ·
Have you tried putting the red alert before the yellow?

Sent from my iPhone

This email is to be read subject to the email disclaimer located at http://www.stanwell.com/site-information/email-disclaimer/
list John Rothlisberger · Mon, 27 Nov 2017 14:04:06 +0000 ·
The "red" alert rules do precede the "yellow" rules.

This problem has to do with the duration not being overridden when the status goes from yellow to red.

Thanks,
John