Xymon Mailing List Archive

Is there a way to "quietly" disable hosts that have NOTICE set?

14 messages in this thread

list Charles Jones · Sun, 24 Sep 2006 22:56:15 -0700 ·
I have the NOTICE flag set for all of my production hosts - I want pages to go out if someone disables or enables any of them.

However, when backups are done, the oracle databases are brought down, which triggers an alert. If they are manually disabled, the NOTICE message goes out which also wakes people up for no reason.

If I use DOWNTIME in bb-hosts, I have to specify a window that is guaranteed to be longer than the longest time the database backup could take (the duration varies, so any fixed window will be wrong from time to time). For example, I would specify an hour of DOWNTIME, but the backups sometimes take only 30 minutes. That leaves a 30-minute window where a real alert would be masked, which is unacceptable in a production environment.

I guess what I'm looking for is a way to send a command to Hobbit from a shell script (called from the db backup script) that would put a host/service in maint mode (disabled - blue dot) and NOT send a NOTICE page.

-Charles
list Francesco Duranti · Mon, 25 Sep 2006 09:40:13 +0200 ·
Directly from the man bb command :D


       disable HOSTNAME.TESTNAME DURATION <additional text>
              Disables a specific test for DURATION minutes. This will
              cause the status of this test to be listed as "blue" on the
              BBDISPLAY server, and no alerts for this host/test will be
              generated. If DURATION is given as a number followed by
              s/m/h/d, it is interpreted as being in
              seconds/minutes/hours/days respectively. To disable all
              tests for a host, use an asterisk "*" for TESTNAME.

       enable HOSTNAME.TESTNAME
              Re-enables a test that had been disabled.

So you can execute bb hobbitserver "disable dbservername.* 2h Backup time"
before shutting down the database, and bb hobbitserver "enable
dbservername.*" right at the end.

Francesco
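
A minimal sketch of the wrapper Francesco describes, assuming the bb client is on the PATH. The variable names (BB, HOBBIT_SERVER, DB_HOST) and the function name are placeholders invented for illustration. As the next reply points out, a disable sent this way still triggers NOTICE messages, so this only shows the mechanics of wrapping a backup in disable/enable.

```shell
#!/bin/sh
# Hypothetical names: adjust for your site.
BB="${BB:-bb}"                                  # path to the bb client
HOBBIT_SERVER="${HOBBIT_SERVER:-hobbitserver}"  # Hobbit server to send to
DB_HOST="${DB_HOST:-dbservername}"              # host as listed in bb-hosts

# run_backup_with_disable COMMAND [ARGS...]
# Disables all tests for DB_HOST, runs the given backup command,
# then re-enables the tests even if the backup failed.
run_backup_with_disable() {
    "$BB" "$HOBBIT_SERVER" "disable $DB_HOST.* 2h Backup time"
    rc=0
    "$@" || rc=$?
    "$BB" "$HOBBIT_SERVER" "enable $DB_HOST.*"
    return "$rc"
}

# Usage (backup_oracle.sh is a placeholder for your own backup script):
#   run_backup_with_disable /usr/local/bin/backup_oracle.sh
```

Sending the enable as soon as the backup command returns means the 2h duration only acts as a safety net if the script dies before it can re-enable the tests.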
list Charles Jones · Mon, 25 Sep 2006 05:12:16 -0700 ·
That sends out NOTICE alerts, though. If you are on call, do you want to get paged at 3am every night when those services are put into MAINT mode? That's why I asked whether there is a way to do a "silent" disable. I basically need something that works exactly like the DOWNTIME option in bb-hosts, except interactively.

-Charles
list Adam Scheblein · Mon, 25 Sep 2006 07:49:03 -0500 ·
We have put in the following because every night we run batch jobs that use almost all the processor power in our server:

SERVICE=cpu
MAIL [e-mail address] COLOR=red DURATION=15 TIME=*:0900:1700 HOST=[hostname]

By changing the TIME field to suit you, you will not get paged; you just have to put an EXHOST=[hostname] in your normal alert rules.

Adam
list Charles Jones · Mon, 25 Sep 2006 09:28:23 -0700 ·
Okay, let me explain this again :)

If I ignore pages for a specific time interval, say 2 hours while
backups are being done, then every minute between the end of the backup
and the end of that interval is a minute in which the database is
essentially not monitored, which is not acceptable.

Example:

via DOWNTIME in bb-hosts, or via TIME specifications in
hobbit-alerts.cfg as you mention,

[---- window where no pages are sent for red status of "oracle" or "procs" ----]
[---------- database backups ----------][------ something can break here ------]

My problem is that something can break during that post-backup interval,
and nobody will be notified. Making the interval shorter is not an
option because the time the backup takes fluctuates from day to day. It
could take 30 minutes Monday night, an hour Tuesday night, 10 minutes
Wednesday night, 73 minutes Thursday. Using the bb disable command from
a script is not a good solution because that makes NOTICE alerts get
sent out, which wakes people up for no reason.

The DOWNTIME option in bb-hosts has the ability to set the status of 
services blue without sending NOTICE alerts. I would like this same 
ability on an interactive basis (via a message to hobbit using the bb 
command).

-Charles
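
To put numbers on the gap described above, here is a small sketch that computes the unmonitored minutes for a fixed window. It uses the 2-hour window and the backup durations from the message (30, 60, 10 and 73 minutes); the function name is made up for illustration.

```shell
#!/bin/sh
# unmonitored_minutes WINDOW DURATION
# Prints how many minutes of a fixed no-alert window remain after the
# backup has already finished (0 if the backup outlasts the window).
unmonitored_minutes() {
    window=$1
    duration=$2
    gap=$((window - duration))
    if [ "$gap" -lt 0 ]; then
        gap=0
    fi
    echo "$gap"
}

# 2-hour window against the nightly durations from the message above.
for d in 30 60 10 73; do
    echo "backup ${d} min -> $(unmonitored_minutes 120 "$d") min unmonitored"
done
```

Whatever fixed window is chosen, the gap only reaches zero on nights when the backup happens to take at least as long as the window.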
list Ralph Mitchell · Mon, 25 Sep 2006 11:45:50 -0500 ·
On 9/25/06, Charles Jones <user-e86b4aeade4e@xymon.invalid> wrote:
> The DOWNTIME option in bb-hosts has the ability to set the status of
> services blue without sending NOTICE alerts. I would like this same ability
> on an interactive basis (via a message to hobbit using the bb command).

So, essentially you need DOWNTIME to have something like the "until
green again" option that was added to the enable/disable page in 4.2,
right??

Unless I've completely missed your point, for which I apologise in advance...:)

Ralph Mitchell
list Francesco Duranti · Mon, 25 Sep 2006 18:55:51 +0200 ·
At the moment I don't think it's possible to do anything like that.
The only three alternatives I see are:
1) Write a script that changes the DOWNTIME setting in the bb-hosts
file at the start and at the end of the backup; hobbit should read it
for the next test and not send an alert or notice.
2) Enable the NOTICE alert only outside the backup time. For example,
if you do backups between 02:00 and 05:00, you can enable NOTICE only
from 0-2 and from 5-24, and send a disable via the bb command at the
start of the backup. With this you will not know if someone disables
a test during the backup hours, but at least you get the alert if the
host is down (probably better than having hobbit not alert on a fault).
3) At least for the "oracle" test, if it runs locally on the oracle
database host, you can change it to check for the existence of a file
named something like /oracle/backup_now; if that file exists, the
script skips the checks and sends a clear state to hobbit. To do the
same for procs I think you would have to modify the hobbit client
too, and I don't know how simple that would be.

Francesco
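
Alternative 3 could look roughly like the sketch below. It assumes the oracle test is a local script that reports with a bb "status" message. The variable names, the function name and the message texts are invented for illustration, and the real database check is passed in as a command rather than shown.

```shell
#!/bin/sh
# Hypothetical names: adjust for your site.
BB="${BB:-bb}"
HOBBIT_SERVER="${HOBBIT_SERVER:-hobbitserver}"
DB_HOST="${DB_HOST:-dbservername}"
BACKUP_FLAG="${BACKUP_FLAG:-/oracle/backup_now}"  # created by the backup script

# report_oracle_status CHECK_COMMAND [ARGS...]
# Sends "clear" while the flag file exists; otherwise runs the check
# command and sends green or red depending on its exit status.
report_oracle_status() {
    if [ -e "$BACKUP_FLAG" ]; then
        color=clear
        text="backup in progress, check skipped"
    elif "$@"; then
        color=green
        text="database OK"
    else
        color=red
        text="database check failed"
    fi
    "$BB" "$HOBBIT_SERVER" "status $DB_HOST.oracle $color $text"
}
```

The backup script would create the flag file before shutting the database down and remove it once the database is back, so normal checking resumes on the very next test run.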
list Charles Jones · Mon, 25 Sep 2006 10:03:05 -0700 ·
Yep...I tried doing a disable for -1 (until OK), but if you do that while the service is still green, it immediately recovers.
list Pnixon · Mon, 25 Sep 2006 13:04:33 -0400 ·
Have your script disable it every five minutes or so until the backup is
done? 
list Charles Jones · Mon, 25 Sep 2006 10:21:38 -0700 ·
If you are on call, do you want to get woken up by the DISABLE notice alerts every 5 minutes at 3am (or even by one of them)? :-)

Before you say "don't use NOTICE in hobbit-alerts.cfg then", consider that when things are purposefully disabled for reasons other than scheduled downtime, folks want to know about it.

-Charles
list Pnixon · Mon, 25 Sep 2006 13:30:30 -0400 ·
Here's another solution:
Write a script to handle the paging for this server and service (oracle).
It checks whether the backup is running: if so, don't page; if not, page.
 --Pat
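
A rough sketch of such a paging script, meant to be called as a SCRIPT recipient from hobbit-alerts.cfg. It assumes the backup script creates a flag file while it runs. The flag path, the MAILER variable and the function name are invented for illustration. RCPT, BBHOSTSVC, BBCOLORLEVEL and BBALPHAMSG are environment variables that, as far as I know, Hobbit sets for alert scripts; check hobbit-alerts.cfg(5) for your version.

```shell
#!/bin/sh
# Hypothetical names: adjust for your site.
BACKUP_FLAG="${BACKUP_FLAG:-/oracle/backup_now}"  # created by the backup script
MAILER="${MAILER:-mail}"                          # command that delivers the page

# page_unless_backup
# Pages only when no backup is running. RCPT, BBHOSTSVC, BBCOLORLEVEL
# and BBALPHAMSG are expected in the environment (assumed to be set by
# Hobbit when it runs a SCRIPT recipient).
page_unless_backup() {
    if [ -e "$BACKUP_FLAG" ]; then
        return 0    # backup running: stay quiet
    fi
    printf '%s\n' "$BBALPHAMSG" |
        "$MAILER" -s "Hobbit $BBHOSTSVC is $BBCOLORLEVEL" "$RCPT"
}

# When installed as the alert script, end the file with:
#   page_unless_backup
```

This only filters the pages sent through this script; whether it also covers the NOTICE messages depends on how the NOTICE rules are routed.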
list Tom Kauffman · Mon, 25 Sep 2006 14:02:52 -0400 ·
Does your oracle test run on the same box?

We're running a hacked version of the 'roracle' script from deadcat; one
of the hacks is to report 'blue' if the backup is running. Our backup
script sets a specific lockfile at the start and removes it at the end,
and we monitor the lockfile's existence to alert on a backup running
too long.

Maybe this would be a better approach? Possibly send 'clear' instead of
'blue', if the 'blue' triggers the notice?

Tom Kauffman
NIBCO, Inc
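
The lockfile handling Tom describes might be sketched as follows. This is not the 'roracle' hack itself, only an illustration: the function names and state names are invented, and the age check relies on find's -mmin option, which GNU and BSD find provide but POSIX does not require.

```shell
#!/bin/sh
# run_backup_with_lock LOCKFILE COMMAND [ARGS...]
# Creates the lockfile, runs the backup, and removes the lockfile
# afterwards whether or not the backup succeeded.
run_backup_with_lock() {
    lock=$1
    shift
    : > "$lock"
    rc=0
    "$@" || rc=$?
    rm -f "$lock"
    return "$rc"
}

# backup_lock_state LOCKFILE MAX_MINUTES
# Prints "idle" when there is no lockfile, "running" while the lockfile
# is at most MAX_MINUTES old, and "overdue" when it is older than that.
backup_lock_state() {
    lock=$1
    max=$2
    if [ ! -e "$lock" ]; then
        echo idle
    elif [ -n "$(find "$lock" -mmin +"$max" 2>/dev/null)" ]; then
        echo overdue
    else
        echo running
    fi
}
```

A test script could map "running" to blue or clear for the oracle test and "overdue" to red for a separate lockfile test, which gives the alert on a backup that runs too long.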
list Buchan Milne · Tue, 26 Sep 2006 13:21:50 +0200 ·
On Monday 25 September 2006 07:56, Charles Jones wrote:
> I have the NOTICE flag set for all of my production hosts - I want pages
> to go out if someone disables or enables any of them.
>
> However, when backups are done, the oracle databases are brought down,
> which triggers an alert. If they are manually disabled, the NOTICE
> message goes out which also wakes people up for no reason.
>
> If I use DOWNTIME in bb-hosts, then I have to specify a window which is
> guaranteed to be longer than the possible time it could take to backup
> the databases (which is a dynamic thing which will surely be wrong from
> time to time). So what ends up happening is for example, I would specify
> an hour of DOWNTIME, but the backups sometimes only take 30 minutes.
> That means there is a 30 minute window where a real alert

for a test being disabled, during scheduled downtime,

> would be masked, which is unacceptable in a production environment.

hmm, in our environment, it is acceptable for tests to be disabled for more
than the time required for a change if it is within the scheduled downtime
for the change.

Of course, the fact that a test was disabled would still be logged by hobbit.

> I guess what I'm looking for, is a way that I can send a commands to
> Hobbit via a shellscript (called from the db backup script), that would
> put a host/services in maint mode (disabled - blue dot), and NOT send a
> NOTICE page.

If you set the TIME on your NOTICE alerts to avoid notifications during
downtime, notifications of notify messages will not be sent. Of course, you
could have a NOTICE alert that covers this time but does not page (to track
anything that does occur).

BTW, it might help if you include some information on how your alerting is set
up, eg. line from bb-hosts and any matching rules from hobbit-alerts.cfg.

Regards,
Buchan


-- 
Buchan Milne
ISP Systems Specialist
B.Eng,RHCE(803004789010797),LPIC-2(LPI000074592)
list Larry Barber · Tue, 26 Sep 2006 11:09:15 -0500 ·
How about something like this:

1. Create a "test" named something like "backup". No alerts should be
configured for this test; you can use NOPROPRED or DOWNTIME to prevent it
from showing up as red and causing anxiety.
2. Make your normal db tests "DEPEND" on this new test.
3. Have your backup scripts send a command to hobbit server setting it to
red at the start and green at the finish. Be sure to set the status duration
to be long enough to prevent purples.

This should cause your normal db tests to go clear during backups, but
restore normal operation after the backups complete.

Thanks,
Larry Barber
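
Step 3 could be sketched as below, assuming the bb "status+LIFETIME" message form is available in your version. The variable names, function names, the 1d lifetime and the message texts are placeholders invented for illustration; the bb-hosts side (the dependency on the "backup" test and NOPROPRED) is not shown.

```shell
#!/bin/sh
# Hypothetical names: adjust for your site.
BB="${BB:-bb}"
HOBBIT_SERVER="${HOBBIT_SERVER:-hobbitserver}"
DB_HOST="${DB_HOST:-dbservername}"
STATUS_LIFETIME="${STATUS_LIFETIME:-1d}"  # long enough to avoid purple

# send_backup_status COLOR TEXT...
# Reports the state of the "backup" pseudo-test for DB_HOST.
send_backup_status() {
    color=$1
    shift
    "$BB" "$HOBBIT_SERVER" "status+$STATUS_LIFETIME $DB_HOST.backup $color $*"
}

# run_backup_with_status COMMAND [ARGS...]
# Marks the backup test red, runs the backup, then marks it green again.
run_backup_with_status() {
    send_backup_status red "backup started"
    rc=0
    "$@" || rc=$?
    send_backup_status green "backup finished (exit status $rc)"
    return "$rc"
}
```

Because the green status is sent the moment the backup command returns, the dependent db tests are only suppressed for the actual length of the backup rather than for a fixed window.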