Xymon Mailing List Archive

system reboot email alert

9 messages in this thread

list Sue Bauer-Lee · Thu, 10 Jul 2014 15:24:14 +0000 ·
Hello All!

Times have changed a bit in the infrastructure environment such that many have a large contingent of virtual machines and blades. These hosts tend to reboot rather quickly, hence connectivity failures tend to not get noticed. I have a request on the table to send an email alert with a specific subject line to indicate that a host has rebooted.

I've seen a few posts over the years regarding requests to alert for system reboots. I also saw a post or two about adding a dynamic column capability. Some responses suggested an external script to accomplish the task of emailing an alert.

I'm asking the question again because as I peruse the analysis.cfg and review the rules, I know I can issue an email alert for PROC and DISK.
That begs the question of 'why can't I issue an email alert for UP'? I'm fine with the 'yellow' on CPU test for recent reboot, but since it also includes load changes, I'm not interested in a generic email for either one when they occur.

If I already can, how??

And if not, suggestions for a simple approach? I'm not exactly getting the desired results server-wide with an external script that should just send an email for a CPU status color=yellow without interfering with our other configured alerts. :(

Is someone already successfully conquering this task?


NOTICE OF CONFIDENTIALITY: This message and any attachments contains confidential information belonging to the sender intended only for the use of the individual or entity named above. If you are not the intended recipient, be advised that copying, disclosure or reliance upon the contents is strictly prohibited. If you have received this message in error please notify the sender immediately.


This email has been scanned by the Symantec Email Security.cloud service.
For more information please visit http://www.symanteccloud.com
list Galen Johnson · Thu, 10 Jul 2014 18:39:42 +0000 ·
Yes...but not with Xymon.  I have a script that runs on startup that sends an email to alert me that the server rebooted with instructions for what to do (if needed).  We used to have some services that required user specific ssh-agents to be running so this was written to let those teams know to log back in and restart the agents.  Basically (from memory):

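#!/bin/sh

# NOTE: a chkconfig header normally lists runlevels, start priority and stop
# priority; only two values survive here (script reproduced from memory), so
# add the missing one before running "chkconfig --add".
###
# chkconfig: 35 88
# description: Reboot message
###

SENDMAIL="/usr/sbin/sendmail"
HOST=$(hostname)
NOW=$(date)
# Space-separated recipients, passed to sendmail as separate arguments.
MAILTO="user-428942ae1d07@xymon.invalid user-df6392666e93@xymon.invalid"
# The To: header wants a comma-separated list.
TO_HEADER=$(echo "$MAILTO" | sed 's/ /, /g')

# Headers, a blank line, then the message body.
cat <<EOF | $SENDMAIL $MAILTO
Subject: $HOST Rebooted!
To: $TO_HEADER
Content-Type: text/plain

ATTENTION!  Reboot of $HOST has taken place at $NOW.

EOF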

I'm sure there are better ways and this could be more fleshed out but you will know immediately that a machine restarted.

=G=
list Sue Bauer-Lee · Fri, 11 Jul 2014 16:59:41 +0000 ·
Thanks so much. I can write that kind of script, and it will certainly take care of the email alert.

I'd much prefer to use Xymon, though: it already knows the host rebooted. My question was really why we can't isolate that alert and report it via email/SMS/whatever, rather than JUST as a column color change in the web interface.
list Asif Iqbal · Fri, 11 Jul 2014 14:20:50 -0400 ·
On Thu, Jul 10, 2014 at 11:24 AM, Bauer-Lee, Sue <user-84011281759f@xymon.invalid> wrote:
> And if not, suggestions for a simple approach? I’m not exactly getting the
> desired results server-wide with an external script that should just send
> an email for a CPU status color=yellow without interfering with our other
> configured alerts. :(

You could right a server side extension script based on the [uptime]
section of clientdata
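
As a rough sketch of that idea, the shell functions below pull the [uptime] section out of a client message read from stdin and convert the uptime(1) line into minutes. The function names and the 30-minute default are made up for illustration, and the parsing assumes the common Linux/BSD uptime formats ("up 5 min", "up 3:04", "up 12 days, 3:04"). How the client message reaches the script (for example via the xymon "clientlog" command or a xymond_channel worker) is left out.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from Xymon itself.

# Print the body of the [uptime] section of a client message on stdin.
uptime_section() {
    awk '/^\[uptime\]/ { found = 1; next }   # start printing after this header
         /^\[/         { found = 0 }         # any other header ends the section
         found'
}

# Convert one line of uptime(1) output ($1) into whole minutes.
uptime_to_minutes() {
    printf '%s\n' "$1" | awk '{
        sub(/^.*up +/, "")               # drop the clock time and the word "up"
        sub(/, *[0-9]+ users?.*$/, "")   # drop the user count and load average
        mins = 0
        n = split($0, parts, /, */)
        for (i = 1; i <= n; i++) {
            p = parts[i]
            if (p ~ /day/)                   mins += (p + 0) * 1440
            else if (p ~ /min/)              mins += p + 0
            else if (p ~ /hr|hour/)          mins += (p + 0) * 60
            else if (split(p, hm, ":") == 2) mins += hm[1] * 60 + hm[2]
        }
        print mins
    }'
}

# Succeed if the client message on stdin shows an uptime below the
# threshold in minutes ($1, assumed default 30).
check_recent_reboot() {
    line=$(uptime_section | head -n 1)
    [ -n "$line" ] || return 1           # no [uptime] data: do not guess
    mins=$(uptime_to_minutes "$line")
    [ "$mins" -lt "${1:-30}" ]
}
```

A caller would pipe one host's client message into check_recent_reboot and send the mail (or a status update) when it succeeds.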


-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
list Asif Iqbal · Fri, 11 Jul 2014 15:30:34 -0400 ·
On Fri, Jul 11, 2014 at 2:20 PM, Asif Iqbal <user-6f4b51ac2a40@xymon.invalid> wrote:
> You could right a server side extension script based on the [uptime]
> section of clientdata

You could write..
list Japheth Cleaver · Fri, 11 Jul 2014 13:14:25 -0700 ·
On Fri, July 11, 2014 11:20 am, Asif Iqbal wrote:
> On Thu, Jul 10, 2014 at 11:24 AM, Bauer-Lee, Sue <user-84011281759f@xymon.invalid> wrote:
>> And if not, suggestions for a simple approach? I’m not exactly getting the
>> desired results server-wide with an external script that should just send
>> an email for a CPU status color=yellow without interfering with our other
>> configured alerts. :(
>
> You could right a server side extension script based on the [uptime]
> section of clientdata

Yeah, the state model does tend to break down when trying to interpret
one-time events like this (monitoring systems tend to fall into one camp
or the other philosophically).

Getting the data in is easy enough: either parse the output of uptime, or
cat /proc/uptime to get it as a single value. The question is whether it's
especially useful to "waste" an entire status column in memory just for
this.
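
For the /proc/uptime route, a minimal client-side sketch might look like this. It assumes a Linux client; the function names and the 30-minute threshold are illustrative only, and reporting the result to Xymon is not shown.

```shell
#!/bin/sh
# Hypothetical helpers for reading /proc/uptime on a Linux client.

# Convert the contents of /proc/uptime ($1, e.g. "350735.47 234388.90")
# into whole minutes since boot.
uptime_minutes() {
    secs=${1%% *}      # keep the first field (seconds since boot)
    secs=${secs%%.*}   # drop the fractional part
    echo $((secs / 60))
}

# Succeed if the uptime in $1 is below the threshold in minutes
# ($2, assumed default 30).
recently_rebooted() {
    [ "$(uptime_minutes "$1")" -lt "${2:-30}" ]
}

if [ -r /proc/uptime ]; then
    if recently_rebooted "$(cat /proc/uptime)"; then
        echo "recent reboot"
    fi
fi
```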

The one spot in xymon that does work around event-based tracking is the
log file parser or 'msgs' test, which converts a point-in-time event into
a 6x(runtime)-long stateful alert. As a quick hack, you could look for
strings that occur in the log file only on a (normal) bootup and trigger
those as a critical alert:

Something like...

analysis.cfg:
  LOG /var/log/messages "kernel:  bootmap" COLOR=red GROUP=hostrestart
  LOG /var/log/messages "%(?-i)syslog.+start" COLOR=red GROUP=hostrestart

alerts.cfg:
SERVICE=msgs COLOR=red GROUP=hostrestart
  MAIL user-dfde56b39b09@xymon.invalid


This is... totally untested, but I think it should work.

Longer-term, yeah, it would be nice to have a single "fake" status column
that would accept non-stateful event alerts (or modify messages) and pass
them through. It might make integration with other alerting systems easier.


HTH,
-jc
list Asif Iqbal · Fri, 11 Jul 2014 16:18:05 -0400 ·
On Fri, Jul 11, 2014 at 4:14 PM, J.C. Cleaver <user-87556346d4af@xymon.invalid> wrote:
> Getting the data in is easy enough: either parsing uptime, or cat-ing in
> /proc/uptime to get it in a single value, the question is whether it's
> especially useful to "waste" an entire status column in memory just for
> this.

[uptime] is already in the clientdata as part of default install.
list Robert Herron · Sat, 12 Jul 2014 09:16:30 -0400 ·
Check out the UP setting in analysis.cfg
(https://www.xymon.com/help/manpages/man5/analysis.cfg.5.html):

*UP bootlimit toolonglimit [color]*

The cpu status goes yellow/red if the system has been up for less than
"bootlimit" time, or longer than "toolonglimit". The time is in minutes, or
you can add h/d/w for hours/days/weeks - eg. "2h" for two hours, or "4w"
for 4 weeks.

Defaults: bootlimit=1h, toolonglimit=-1 (infinite), color=yellow.


So, you could add "UP 30m -1 RED" to either the DEFAULT stanza or to select
hosts.  The CPU test will then have a "&red Machine recently rebooted" at
the top (similar to this:
https://www.xymon.com/xymon-cgi/historylog.sh?HOST=brahms.hswn.dk&SERVICE=cpu&TIMEBUF=Mon_Jul_7_22:59:08_2014)
when the host's uptime is less than 30m.
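
On the alerts.cfg side, one possible approach (a sketch only, not tested against a live Xymon server) is to route red cpu alerts through a small script that mails only when the status text contains the reboot notice, so plain load alerts stay quiet. The script path, recipient and function names are placeholders; BBALPHAMSG, BBHOSTNAME, RCPT and RECOVERED are the environment variables the alerts.cfg manual page describes for SCRIPT recipients.

```shell
#!/bin/sh
# Hypothetical alert script; the path and recipient below are placeholders.
# Assumed alerts.cfg rule that would call it:
#   HOST=* SERVICE=cpu COLOR=red
#       SCRIPT /usr/local/bin/reboot-mail.sh admin@example.com

# Succeed only if the status text ($1) carries the reboot notice.
is_reboot_alert() {
    case "$1" in
        *"Machine recently rebooted"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print a minimal mail message. $1: recipient, $2: hostname.
build_mail() {
    printf 'To: %s\nSubject: %s Rebooted!\nContent-Type: text/plain\n\nXymon reports that %s was recently rebooted.\n' "$1" "$2" "$2"
}

# Act only when called by Xymon (RCPT set) for an active alert
# (RECOVERED is "0") whose status text mentions the reboot.
if [ -n "${RCPT:-}" ] && [ "${RECOVERED:-0}" = "0" ] \
        && is_reboot_alert "${BBALPHAMSG:-}"; then
    build_mail "$RCPT" "$BBHOSTNAME" | /usr/sbin/sendmail "$RCPT"
fi
```

This keeps the existing cpu alert rules untouched: the extra rule only adds a recipient that ignores everything except the reboot line.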


Robert Herron
user-8b27ea4290da@xymon.invalid
list Sue Bauer-Lee · Sat, 12 Jul 2014 16:40:56 +0000 ·
And the entry for alerts.cfg to send an email notification?