Xymon Mailing List Archive

Monitoring a simple cluster

6 messages in this thread

list Dan Smith · Thu, 21 Jun 2012 13:17:04 -0400 ·
Red Hat 5.8 server, 5.5 clients.  Xymon 4.3.7 clients and server. 

 
I am trying to put together a quick cluster monitor using depend and combos,
but I think I'm stuck.  Has anyone else done this?

 
I have two servers: smtp01 and smtp02 that are in an active/passive cluster.

The smtpd process only runs on the server that is active, but there are
other processes that need to be monitored on both hosts (e.g. clurgmgrd).

 
I would like to go red and have an alert if one of the required processes
goes down on either server, but I only want an alert if the active node of
the cluster has a problem with the smtpd process.

 
My initial idea was to have procs go red for clurgmgrd and yellow on smtpd,
use NOPROPYELLOW, and then use a combo so I could see what host was active.

 
hosts.cfg:

1.2.3.4    smtp01 # NOPROPYELLOW:procs
1.2.3.5    smtp02 # NOPROPYELLOW:procs
1.2.3.6    smtpHA # smtp

analysis.cfg:

HOST=smtp01
            PROC   clurgmgrd
            PROC   smtpd
HOST=smtp02
            PROC   clurgmgrd
            PROC   smtpd

combo.cfg:

smtpHA.procs = (smtp01.conn && smtp01.procs) || (smtp02.conn && smtp02.procs)

 
Unfortunately a yellow status is equal to a 1 for the combo, so even if both
sides of the cluster were down, the combo would still show as up
(green&&yellow is 1&&1).
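The arithmetic above can be reproduced with a small sketch. It assumes only what the post states, that green and yellow both count as 1 in a combo expression, plus the assumption that red counts as 0. The function and dictionary names are illustrative and are not part of Xymon.

```python
# Value each color contributes to a combo expression.
# green/yellow = 1 is taken from the post; red = 0 is an assumption.
COMBO_VALUE = {"green": 1, "yellow": 1, "red": 0}


def smtpha_procs(colors):
    """Evaluate the smtpHA.procs expression from combo.cfg for the given colors."""
    v = {name: COMBO_VALUE[color] for name, color in colors.items()}
    return int((v["smtp01.conn"] and v["smtp01.procs"])
               or (v["smtp02.conn"] and v["smtp02.procs"]))


# smtpd is down on both nodes, so procs is yellow on both.
both_smtpd_down = {
    "smtp01.conn": "green", "smtp01.procs": "yellow",
    "smtp02.conn": "green", "smtp02.procs": "yellow",
}
print(smtpha_procs(both_smtpd_down))  # 1: the combo still reports "up"
```

Because yellow evaluates the same as green, the expression cannot tell "smtpd running" from "smtpd missing" on either node.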

 
Then I thought I could cover the scenario by making a "depends" test, but
depends seems to be focused on disabling tests if another status is
red...which doesn't work either since I don't want the page to go red.

 
This isn't a huge issue because the smtp poll on smtpHA will trigger red if
both nodes are down, but it seems like there should be an easy way to do
this.

 
Am I making it more complicated than it needs to be, or am I better writing
a custom monitor?

 
Thanks!

 
-dan
list Bruce White · Thu, 21 Jun 2012 16:18:27 -0500 ·
I have a couple of "traditional" active/passive clusters.  For items
that need to be present on all nodes, I use the standard monitoring
available with Xymon and its client.  For things that need to only
appear on the active node, I handle it in one of two ways.

 
Way #1 - I created stripped-down Xymon clients which run with the
application within the cluster.  I have HP Serviceguard clusters, so
this is easy to do within their framework of scripts, which handle most
everything.  I create a new client on the disks which migrate
within the cluster, edit the xymonclient-<O/S>.sh script within this
new client directory structure, and assign an IP/host in hosts.cfg
which matches the IP which floats within the cluster.  I pull all the
CPU, memory, etc. stuff out of the new xymonclient.sh and focus on the
disks, procs, ports, etc. which tend to be very application specific.
I let standard Xymon take care of the rest.

 
Way #2 - I set up the IP/host which matches the floating IP under cluster
control and run scripts from the Xymon server pointing at the floating
IP.  The scripts report status to the Xymon server using the name
associated with the floating IP, so to the outside observer it looks
like a new server, but it's just an application running where it needs to
under cluster control.
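
A script of the kind described in Way #2 might report its result roughly like this. This is only a sketch, not the script actually in use here: smtpHA is the floating name from the original post, xymon.example.com is a placeholder server name, and build_status/send_status are made-up helper names. It assumes the usual Xymon status message layout ("status host.test color text") sent to the server's TCP port, normally 1984. Many ext scripts call the bundled xymon command-line client for this step instead.

```python
import socket


def build_status(host, test, color, text):
    """Build a Xymon "status" message for host.test.

    Dots in the host name are replaced with commas, since the dot separates
    the host name from the test name in the message.
    """
    return f"status {host.replace('.', ',')}.{test} {color} {text}\n"


def send_status(server, message, port=1984):
    """Send one message to the Xymon server over TCP (1984 is the usual port)."""
    with socket.create_connection((server, port), timeout=10) as conn:
        conn.sendall(message.encode("utf-8"))


# Report under the floating name, whichever node currently holds the IP.
msg = build_status("smtpHA", "procs", "green", "smtpd running on active node")
print(msg, end="")
# send_status("xymon.example.com", msg)  # hypothetical server name
```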

 
For checking just SMTP, the second might be the way to go, as you could just
use the standard Xymon SMTP test to see that SMTP is running on the IP
which floats.

 
            ......Bruce

 
Bruce White
Senior Enterprise Systems Engineer | Phone: X-XXX-XXX-XXXX | Fax: XXX-XXX-XXXX | user-58f975e8bf9d@xymon.invalid | http://www.fellowes.com/
 
 
 
Disclaimer: The information contained in this message may be privileged and confidential and protected from disclosure. If the reader of this message is not the intended recipient or an employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by replying to the message and deleting it from your computer. Thank you. Fellowes, Inc.
list Colin Coe · Fri, 22 Jun 2012 13:51:49 +0800 ·
Crazy thought, but why don't you have smtp01 defined in DNS as an MX with
priority of 10 and smtp02 defined in DNS as an MX with priority of 20?

You may have a perfectly valid reason for doing what you are doing, but
simply having a primary and a secondary SMTP server would be simplest to
administer and to monitor.

CC
-- 

RHCE#805007969328369
list Adam Goryachev · Fri, 22 Jun 2012 17:07:55 +1000 ·
On 22/06/12 15:51, Colin Coe wrote:
> Crazy thought, but why don't you have smtp01 defined in DNS as an MX
> with priority of 10 and smtp02 defined in DNS as an MX with priority
> of 20.
>
> You may have a perfectly valid reason for doing what you are doing,
> but simply having a primary and a secondary SMTP server would be
> simplest to administer and to monitor.

Possibly due to the servers being used as the "smarthost" for random
users sending emails via SMTP?

In any case, it would be interesting to see how this can be done, or how
other people do it, since I've recently implemented a SAN system with an
active + standby and would like to do a very similar thing.

Regards,
Adam

-- 
Adam Goryachev
Website Managers
Ph: +XX X XXXX XXXX                            user-eaec2ffb4cbc@xymon.invalid
Fax: +XX X XXXX XXXX                            www.websitemanagers.com.au
list Dan Smith · Fri, 22 Jun 2012 08:18:18 -0400 ·
Bruce,

If it was just SMTP, I'd be all set with your second suggestion (just forget
the process monitor and do a network test to port 25), but as always, it's
more complicated.  There are a few processes that I really have to watch that
don't have external ports to poll.

 
Your first suggestion is a great one, too.  I'm just not positive that
creating a mini-client would be easier than writing a monitor to check the
clustat output and then report on cluster processes.  It might be worth
trying both.

 
Another idea I had is around changing depends to recognize clear or yellow
as a failure (0) instead of a success.  That would let me return a
yellow or clear on smtpd process down, letting the combo work the way it
should if both nodes were down.

smtpHA.procs = (smtp01.conn && smtp01.procs) || (smtp02.conn && smtp02.procs)
             = (1 && 0) || (1 && 1)
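
To compare the two behaviours, here is a rough sketch that evaluates the same expression with a configurable set of colors counted as success. It is not how Xymon is implemented; the names evaluate, CURRENT and PROPOSED are invented for illustration, and the "current" set simply reflects the behaviour described in this thread.

```python
def evaluate(colors, ok_colors):
    """Evaluate the smtpHA.procs expression, counting only ok_colors as 1."""
    v = {name: int(color in ok_colors) for name, color in colors.items()}
    return int((v["smtp01.conn"] and v["smtp01.procs"])
               or (v["smtp02.conn"] and v["smtp02.procs"]))


CURRENT = {"green", "yellow", "clear"}  # yellow and clear count as success
PROPOSED = {"green"}                    # the idea above: yellow and clear count as 0

# smtp01 is the standby (smtpd absent, procs yellow); smtp02 is active.
one_active = {
    "smtp01.conn": "green", "smtp01.procs": "yellow",
    "smtp02.conn": "green", "smtp02.procs": "green",
}
# smtpd is missing on both nodes.
none_active = dict(one_active, **{"smtp02.procs": "yellow"})

print(evaluate(one_active, PROPOSED))   # 1: (1 && 0) || (1 && 1)
print(evaluate(none_active, PROPOSED))  # 0: both sides fail
print(evaluate(none_active, CURRENT))   # 1: the failure is hidden
```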

 
As always, thanks for the suggestions - one of them will do the trick!

 
-dan
list Dan Smith · Fri, 22 Jun 2012 08:28:20 -0400 ·
Colin, you're right - if it was just SMTP, that would be a great solution.
I was using SMTP as an example service, but there are others in the
clustered group that need to move, too.

Adam's point is exactly why I thought it would be good to throw the question
out to the group... this exact situation has come up a bunch of times with
our implementation of bb/hobbit/xymon, and every time the solution was an
external cluster monitor script.  The script checked the role of the server
and reported data back from the active host using the hostname of an HA
"host" (smtpHA in the example).  

My thinking was that if there is enough traction, we might come up with a
group consensus on how Xymon could handle this internally.
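
For reference, the decision an external cluster monitor script has to make can be sketched as below. This is a hypothetical outline rather than any of the scripts mentioned above: how each node's role and process list are collected is left out, and the names REQUIRED_EVERYWHERE, ACTIVE_ONLY and ha_status are invented. The process names come from the original post.

```python
REQUIRED_EVERYWHERE = {"clurgmgrd"}  # must run on every node
ACTIVE_ONLY = {"smtpd"}              # must run on the active node only


def ha_status(nodes):
    """Return (color, text) to report under the HA host name.

    nodes maps node name -> {"active": bool, "procs": set of running process names}.
    """
    problems = []
    active = [name for name, info in nodes.items() if info["active"]]
    if len(active) != 1:
        problems.append(f"expected exactly one active node, found {len(active)}")
    for name, info in sorted(nodes.items()):
        wanted = set(REQUIRED_EVERYWHERE)
        if info["active"]:
            wanted |= ACTIVE_ONLY
        for proc in sorted(wanted - info["procs"]):
            problems.append(f"{proc} not running on {name}")
    if problems:
        return "red", "; ".join(problems)
    return "green", f"all cluster processes OK, active node is {active[0]}"


healthy = {
    "smtp01": {"active": True, "procs": {"clurgmgrd", "smtpd"}},
    "smtp02": {"active": False, "procs": {"clurgmgrd"}},
}
print(ha_status(healthy))
```

The result would then be sent as a status for the HA "host" (smtpHA in the example), so a missing smtpd on the standby node never raises an alert.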

-dan
