Xymon Mailing List Archive

single alert for a host going purple

6 messages in this thread

list Hezki Englander · Tue, 13 Nov 2007 16:54:29 +0200 ·
Hi

When a client is down, all of its monitored services go purple.
This usually causes multiple alerts to be sent, one for each service.
Recipients can get really annoyed by the number of alerts when all they
care to know is that the client should be restarted.

Is there a way to get just a single alert telling you the client stopped
running, or at least just one purple event?
Has anyone configured the alerts file to prevent this situation?

Thanks,

Hezki
list Josh Luthman · Tue, 13 Nov 2007 11:24:30 -0500 ·
Let me understand your situation.  You have one host, e.g.:

1.2.3.4 mybroken.host.com # ssh dns pop3 smtp http://mybrokenhost.com

Now after some time ssh and dns go down, but the box is still
green on conn/ping?  Logically, there are two problems (bad bind and bad
opensshd), so two alerts is a wise choice.

If you are saying that, with the example above, every test goes down because
the host is offline, then in my case at least I get only one alert (bad conn);
the rest are essentially tagged with "depend" and don't alert.  Is this
not the case for you?
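
The behaviour described above (one alert for conn, the other tests held back) can be modelled roughly as follows. This is only a sketch of the idea, not Hobbit's actual code: the function name `alerts_to_send`, the `gate` parameter and the sample status dictionaries are all made up for illustration.

```python
# Hypothetical model of "depend"-style alert suppression; not Hobbit's real logic.
def alerts_to_send(statuses, gate="conn"):
    """Return the tests that should alert for one host.

    statuses maps test name -> color, e.g. {"conn": "red", "ssh": "red"}.
    If the gate test (ping) is red, only the gate alerts; otherwise every
    red test alerts on its own.
    """
    if statuses.get(gate) == "red":
        return [gate]
    return sorted(test for test, color in statuses.items() if color == "red")


# Whole host offline: every network test is red, including conn.
host_down = {"conn": "red", "ssh": "red", "dns": "red", "smtp": "red"}
# Host answers ping, but two services are broken.
services_down = {"conn": "green", "ssh": "red", "dns": "red", "smtp": "green"}

print(alerts_to_send(host_down))      # ['conn']
print(alerts_to_send(services_down))  # ['dns', 'ssh']
```

In the first case a single alert goes out; in the second, two separate problems produce two alerts.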

-- 

Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX

Those who don't understand UNIX are condemned to reinvent it, poorly.
--- Henry Spencer
list Frédéric Mangeant · Tue, 13 Nov 2007 17:29:52 +0100 ·
Hi

There are various situations where a host responds to ping but does not send any status to hobbitd:
- A firewall rule was modified, so connections from the server to the hobbitd daemon are no longer allowed
- The Hobbit client is not started automatically after a system restart
- The Hobbit client has stopped, hung, etc.

-- 

Frédéric Mangeant

Steria EDC Sophia Antipolis
list Josh Luthman · Tue, 13 Nov 2007 11:48:35 -0500 ·
Well, your first problem can't really be avoided unless the person
modifying the firewall rules is more on top of their game.

For the second issue, you can easily have the client start when entering
runlevel 3 (which a lot of people use, the remainder mostly using 5) by doing an
ln -s /etc/init.d/hobbit /etc/rc3.d/S82hobbit

I have never seen Hobbit crash, but I have only been using it for upwards of
a month or two.  Still, a crashed Hobbit won't stop the host from answering
ICMP echoes.

Josh
list Hezki Englander · Tue, 13 Nov 2007 18:53:40 +0200 ·
Frédéric Mangeant wrote:
- firewall rule modified, so no connection from a server to the hobbitd
daemon allowed
- Hobbit client not run automatically after system restart
- Hobbit client stopped, hung, etc.

Yes, I am talking about these cases,
not about tests run by the server, which are "hidden" by conn when there is
no connectivity.
I am talking about a Hobbit client that is not running or cannot reach the
server, so that all the locally monitored services (cpu, disk, memory, etc.)
send many purple alerts.
list Hezki Englander · Tue, 13 Nov 2007 19:01:19 +0200 ·
Thanks, but I'm not asking how to avoid having purple dots at all.
When you are monitoring a large number of clients, it is quite likely that
a client will stop reporting once in a while.
This is why I am asking for a way to reduce the number of notifications
per host when everything from that host goes purple.
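
To make the request concrete, here is a rough sketch of the grouping I have in mind. It is not an existing Hobbit feature or configuration: the function `collapse_purple`, the `threshold` parameter, the event tuples and the host names are all hypothetical, and a script like this would have to be wired into the alerting path by hand.

```python
from collections import defaultdict


def collapse_purple(events, threshold=2):
    """Turn (host, test, color) events into alert messages.

    Hosts with at least `threshold` purple tests get one combined message;
    everything else is reported test by test.
    """
    purple = defaultdict(list)  # host -> list of purple test names
    messages = []
    for host, test, color in events:
        if color == "purple":
            purple[host].append(test)
        else:
            messages.append(f"{host}: {test} is {color}")
    for host, tests in purple.items():
        if len(tests) >= threshold:
            names = ", ".join(sorted(tests))
            messages.append(f"{host}: client not reporting ({names} purple)")
        else:
            messages.extend(f"{host}: {t} is purple" for t in tests)
    return messages


events = [
    ("web1", "cpu", "purple"),
    ("web1", "disk", "purple"),
    ("web1", "memory", "purple"),
    ("db1", "disk", "red"),
]
for line in collapse_purple(events):
    print(line)
```

With the sample events above, web1 produces one "client not reporting" message instead of three purple alerts, while the red disk alert for db1 is passed through unchanged.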