Xymon Mailing List Archive

Alerts on server reboot?

8 messages in this thread

list Stewart Larsen · Tue, 10 Jul 2007 10:44:39 -0400 ·
Is there a way to suppress the flood of alerts we get on a server
reboot?  If the machine is down for 10 minutes and comes back up, we get
flooded with purple messages.

--
Stewart
list Tod Hansmann · Tue, 10 Jul 2007 13:05:02 -0600 ·
Purple messages on what?  Connection alarms should only go red, maybe
for a couple polling cycles.  Services, I'd imagine, would come back up
with the server.

Regardless, you can suppress alarms for a given host and/or service on
that host in several ways.  The two that come to mind for me are:
 - Use the DOWNTIME directive in bb-hosts (see the man-page)
 - Use the web interface to disable the host tests for however long you
need when you need to restart a server.
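
For reference, a DOWNTIME entry in bb-hosts might look roughly like the
fragment below. This is a sketch based on the day:starttime:endtime form
described in the bb-hosts man page; the IP address, hostname, and time
window are made-up values, so check the man page for your version before
relying on it.

```
# Hypothetical host entry: ignore failed checks every Monday 17:00-18:00.
# Day codes: 0-6 = Sunday-Saturday, W = weekdays, * = every day.
10.0.0.5   www.example.com   # DOWNTIME=1:1700:1800
```

Note that this only covers planned windows on monitored hosts; it is
evaluated by the Hobbit server itself.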

The alarms will still be logged, but won't show up or flow up the chain
to bb2.html or whatnot.

Hope that helps.

Tod Hansmann
Network Engineer
list Stewart Larsen · Tue, 10 Jul 2007 15:50:57 -0400 ·
I re-read my post and I wasn't clear.  This is about the Hobbit server
going down, not the client machines.

It seems that when the server starts back up after a reboot, it goes,
"My god!  I have not seen an update for all 4000+ hosts in over an hour!
  I should send out purple alerts so that the admin can look into this!"

So the issue is the duration of the Hobbit server downtime.   If it's
down too long, in its mind, it has not received an update in a very long
time, so it sends purple messages for everything it owns.

DOWNTIME won't work, because this is the Hobbit server going down.
I also can't disable the hobbit host tests because, well, my server is
down. :)

Stewart
--

Stewart Larsen
--
This sig intentionally left blank, other than this text explaining that
if not for this text, this sig would be blank.
list Charles Jones · Tue, 10 Jul 2007 16:42:57 -0700 ·
Last time I faced this scenario, before I started the Hobbit server I 
edited hobbit-alerts.cfg and temporarily commented out the MAIL lines :)
Of course this doesn't help you if you have hobbit set to automatically 
start when the server boots. :)
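
The manual edit described above could be scripted along these lines. This
is only a sketch: it assumes recipients are listed on lines whose first
word is MAIL, and the marker string and function names are invented for
the example, not part of Hobbit.

```python
# Hypothetical helper: temporarily mute MAIL recipients in the text of
# hobbit-alerts.cfg, and restore them afterwards.
MARKER = "#MUTED# "  # made-up marker so the change can be undone


def mute_mail_lines(config_text):
    """Comment out every line whose first word is MAIL."""
    muted = []
    for line in config_text.splitlines(keepends=True):
        body = line.lstrip()
        indent = line[: len(line) - len(body)]
        if body.split(None, 1)[:1] == ["MAIL"]:
            muted.append(indent + MARKER + body)
        else:
            muted.append(line)
    return "".join(muted)


def unmute_mail_lines(config_text):
    """Undo mute_mail_lines by stripping the marker again."""
    restored = []
    for line in config_text.splitlines(keepends=True):
        body = line.lstrip()
        indent = line[: len(line) - len(body)]
        if body.startswith(MARKER):
            restored.append(indent + body[len(MARKER):])
        else:
            restored.append(line)
    return "".join(restored)
```

Run the first function on the file contents before starting Hobbit and the
second once the statuses have refreshed. Rules that put MAIL on the same
line as the HOST criteria are not handled by this sketch.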

-Charles
list Stewart Larsen · Tue, 10 Jul 2007 20:59:08 -0400 ·
so, is there nothing I can do about this?  I thought I read at one point
that there was a way to tell the server, "Don't send any alerts within
10 minutes of server startup", or something like that.

Stewart
list Henrik Størner · Wed, 11 Jul 2007 07:30:29 +0200 ·
On Tue, Jul 10, 2007 at 08:59:08PM -0400, Stewart wrote:
> It seems that when the server starts back up after a reboot, it goes,
> "My god!  I have not seen an update for all 4000+ hosts in over an hour!
> I should send out purple alerts so that the admin can look into this!"
>
> So the issue is the duration of the Hobbit server downtime.   If it's
> down too long, in its mind, it has not received an update in a very long
> time, so it sends purple messages for everything it owns.

> so, is there nothing I can do about this?  I thought I read at one point
> that there was a way to tell the server, "Don't send any alerts within
> 10 minutes of server startup", or something like that.

It's built into Hobbit that it won't change a status to purple for the
first 10 minutes after it starts up. That should give everything time to
refresh before the purple storm hits.
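
The intended behaviour can be pictured with a small sketch. This is not
Hobbit's actual code (Hobbit is written in C); the function and parameter
names are invented here, and only the 10-minute figure comes from the
description above.

```python
# Illustrative model of a startup grace period for purple statuses.
STARTUP_GRACE_SECONDS = 10 * 60  # the 10-minute window described above


def should_go_purple(now, server_start, status_valid_until,
                     grace=STARTUP_GRACE_SECONDS):
    """Return True if a stale status may be turned purple.

    All arguments are timestamps in seconds. A status is stale once
    `now` is past `status_valid_until`, but nothing goes purple while
    the server has been up for less than `grace` seconds.
    """
    if now - server_start < grace:
        return False  # still inside the startup window
    return now > status_valid_until
```

With this logic, a status that expired while the server was down stays
untouched for the first 10 minutes and only turns purple if no fresh
update has arrived by then.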

So what you're saying is that this doesn't work - I'll have to try and
re-create this here.


Regards,
Henrik
list Stewart Larsen · Wed, 11 Jul 2007 07:10:27 -0400 ·
If it matters, we're running this on Red Hat Enterprise Linux.
list Larry Barber · Mon, 16 Jul 2007 14:01:00 -0500 ·
You can always turn off hobbitd_alert for a few minutes in hobbitlaunch.cfg.
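
As a rough illustration, the hobbitd_alert task could be switched off by
adding the DISABLED keyword to its section. The ENVFILE path and CMD line
below are assumptions modelled on a typical default install and will
differ on your system; see the hobbitlaunch.cfg man page for the exact
keywords your version supports.

```
[hobbitd_alert]
	ENVFILE /home/hobbit/server/etc/hobbitserver.cfg
	NEEDS hobbitd
	# Remove the next line again once the statuses have refreshed
	DISABLED
	CMD hobbitd_channel --channel=page --log=$BBSERVERLOGS/page.log hobbitd_alert
```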


Thanks,
Larry Barber