troubleshooting missing alert
list Michael Baydoun
Had an event last night that caused a test to go red for 20 minutes, but no
email was sent.
All other events before and after sent alerts as expected.
Verified via the info dot that xymon believes it should alert for this
event.
There is nothing in the maillog for this event, it appears xymon did not
send anything.
Where else can I look to troubleshoot?
In case you want to see for yourself
from bb-hosts
{code}
page SLA SLA Checks
0.0.0.0 <url redacted> # noconn cont;https://<url redacted>/;experience
{code}
from hobbit-alerts.cfg (will have to trust me that there are no matching
exceptions above this entry in the file)
{code}
PAGE=SLA DURATION>1
IGNORE SERVICE=sslcert
MAIL <address redacted> RECOVERED REPEAT=365d COLOR=yellow,purple
MAIL <address redacted> RECOVERED REPEAT=30m COLOR=red
{code}
from the info dot
{code}
Alerting:
Service  Recipient               1st Delay  Stop after  Repeat  Time of Day  Colors
content  <address redacted> (R)  1m         -           52w 1d  -            purple,yellow
         <address redacted> (R)  1m         -           30m     -            red
{code}
list Jamison Maxwell
The first place to look would be wherever you track messages. For instance, I use sendmail to transport messages from the server Xymon runs on to other MTAs in my organization. I'm assuming you're running on Linux, so if you're using sendmail, I'd check /var/log/maillog to see whether the message was submitted from Xymon and where it went from there.
Jamison Maxwell
list Jamison Maxwell
Ooh, I re-read your original message. Sorry, you've already checked there. I'll shut up now.
list Jeremy Laidman
On Wed, Mar 7, 2012 at 1:35 AM, Michael Baydoun <user-105c655da53d@xymon.invalid> wrote:
> Had an event last night that caused a test to go red for 20 minutes, but no email was sent.
> There is nothing in the maillog for this event, it appears xymon did not
> send anything.
Has this worked recently? Perhaps the link between alerting and your MTA has been severed.
> Where else can I look to troubleshoot?
Perhaps look for errors from xymond_alert. These will be in "alert.log". You might find a reason why it can't run /usr/bin/mail. Also, try "xymond_alert --dump-config" and check the parsed config to see if it matches what you expect.
J
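For anyone following along, here is a rough Python sketch of the alert.log check suggested above. It is only an illustration: the helper name `find_log_lines` is made up, the location of alert.log depends on how Xymon was installed, and the search is a plain case-insensitive substring match because the exact log line format is not shown in this thread.

```python
def find_log_lines(log_path, needles):
    """Return (line_number, text) for every line containing any of the needles.

    Matching is a case-insensitive substring test; no particular log
    format is assumed.
    """
    lowered = [needle.lower() for needle in needles]
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, "r", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\n")
            haystack = text.lower()
            if any(needle in haystack for needle in lowered):
                hits.append((number, text))
    return hits


# Example call (the path and search terms are placeholders, not values
# taken from this thread):
#
#   for number, text in find_log_lines("/path/to/alert.log", ["mail", "error"]):
#       print(f"{number}: {text}")
```

Searching for the host name from bb-hosts as well as generic words such as "mail" or "error" may help narrow down whether xymond_alert tried to send anything at the time of the event.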