Xymon Mailing List Archive

Windows Cluster Monitoring Advice

11 messages in this thread

list Padraig Lennon · Tue, 3 Jun 2008 13:13:56 +0200 ·
Hi everyone,
 
I have a requirement to monitor some 2-node Windows 2003 file clusters.
I would like to monitor them with BBWin, but I have the following
questions:
 

*	Should I install the bbwin client on both nodes and activate
only one of the nodes for monitoring? With that setup I would miss
monitoring the local C: and D: drives on each node.
*	Is it possible to make bbwin a clustered application? (i.e. one
that would fail over using the MS clustering mechanism)
*	The cluster.exe supplied with the bbwin package only shows the
cluster groups, not the individual items within the groups. Has anyone
written any extended cluster checks that they would be willing to share
with me? If not, I can write my own. I'm sure I can get this information
with WMI.

At the moment I set up the bbwin client on one of the nodes and then
change the registry entry to use the cluster name (i.e. I install on
node fileclus1n1 and change the registry key to fileclus1). Is this the
correct way to set up the monitoring? Any advice would be greatly
appreciated.
 
I realise the question should probably be on the BBWIN mailing list, but
I thought I would have more chance of getting a reply on this list.
 
 
Thanks in advance
 
 
Padraig Lennon
Senior Systems Engineer
Production Services
Pioneer Global Investments (Dublin)
5th Floor Georges Quay Plaza, Dublin 2
ext: XXXX
Direct dial: 00353 1 480 2081
list Buchan Milne · Tue, 3 Jun 2008 14:56:56 +0200 ·
How you monitor clustered nodes does not depend on the monitoring software (or 
even the OS).

What we typically do is monitor the infrastructure aspects (disk, cpu, memory, 
cluster middleware) on all cluster nodes, and the clustered services (e.g. 
database, IP addresses, applications etc.) independently (e.g. with the 
cluster group name).

I am doing this with Windows, Solaris and Linux clusters.

Regards,
Buchan
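
A minimal sketch of the split described above: infrastructure checks are reported under the physical node name, and clustered services under the cluster group name. The check names in NODE_CHECKS and the function report_as are hypothetical, chosen only for illustration; the host names fileclus1n1 and fileclus1 come from the original question.

```python
# Checks that belong to a physical node (assumed names, for illustration only).
NODE_CHECKS = {"disk", "cpu", "memory", "clustersvc"}


def report_as(check: str, node: str, cluster_group: str) -> str:
    """Return the host name a check result should be reported under."""
    # Infrastructure checks follow the physical node; everything else
    # follows the clustered service, wherever it currently runs.
    return node if check in NODE_CHECKS else cluster_group


if __name__ == "__main__":
    for check in ("disk", "cpu", "fileshare"):
        print(check, "->", report_as(check, "fileclus1n1", "fileclus1"))
```

The point of the design is that a failover changes which node hosts the service, but not the name its status is reported under.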
list Padraig Lennon · Tue, 3 Jun 2008 18:03:25 +0200 ·
Hi Buchan,

Could you give me an example of how you would monitor cluster resources
on the Windows side using the cluster group? What setup would you need?
Do you use the cluster.exe that is bundled with bbwin?

Thanks for your help


Padraig Lennon
Senior Systems Engineer
Production Services
Pioneer Global Investments (Dublin)
5th Floor Georges Quay Plaza, Dublin 2
ext: XXXX
Direct dial: 00353 1 480 2081


"The information in this e-mail and in any attachments is confidential and intended solely 
for the attention and use of the named addressee(s). This information may be subject to legal, 
professional or other privilege and further distribution of it is strictly prohibited without 
our authority. If you are not the intended recipient, you are not authorised to and must not 
disclose, copy, distribute, or retain this message or any part of it, and should notify us 
immediately.

This footnote also confirms that this email has been automatically scanned for the presence 
of computer viruses, profanities and certain file types."

Pioneer Investment Management Limited.

1 George’s Quay Plaza, George’s Quay, Dublin 2, Ireland. 

Registered in Ireland no. 287793.
list Etienne Grignon · Tue, 3 Jun 2008 23:21:05 +0200 ·
Hello Padraig,

This is how I monitor MSCS clusters. I install BBWin on both nodes. I
also install the cluster.exe external to be run on both nodes.

The cluster.exe external will check that every resource of the MSCS
cluster is online. If the resources switch to the other node, you will
get an alarm that stays active until you acknowledge it by deleting a
temporary file on each node.

I hope it will be enough for you.

Regards,


-- 
Etienne GRIGNON
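
A rough sketch of the "every resource is online" check described above. This is not the code of the cluster.exe external; it only illustrates the decision, assuming the resource states have already been collected (for example via WMI, as suggested in the original question). The function name cluster_status and the sample resource names are hypothetical, and the failover alarm with its temporary file is not modelled.

```python
from typing import Dict, Tuple


def cluster_status(resources: Dict[str, str]) -> Tuple[str, str]:
    """Map {resource name: state} to a (colour, report text) pair."""
    # Any resource whose state is not "Online" turns the whole check red.
    offline = sorted(n for n, s in resources.items() if s.lower() != "online")
    colour = "red" if offline else "green"
    if offline:
        summary = "Not online: " + ", ".join(offline)
    else:
        summary = "All resources online"
    details = [f"{name}: {state}" for name, state in sorted(resources.items())]
    return colour, summary + "\n" + "\n".join(details)


if __name__ == "__main__":
    # Sample data, invented for the example.
    sample = {"Disk F:": "Online", "File Share": "Failed", "IP Address": "Online"}
    colour, text = cluster_status(sample)
    print(colour)
    print(text)
```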
list Padraig Lennon · Thu, 5 Jun 2008 11:10:23 +0200 ·
Thanks Etienne,

I have implemented those changes, and all looks good. How do you deal
with event log errors? I was thinking a combo test would work.

A few other issues:

Say I want to monitor a shared disk F: on the cluster. The shared
drive is 1 TB in size. For the moment the disk is on node1 of the
cluster. I want to alert only when the disk is down to 50 GB free. This
is easy to do in the bbwin.cfg file on node1.

Suppose now we have a failover of the resource to node2. Node2 has no
idea about the 50 GB limit, and node1 goes into an alert status because
it can't find the F: drive.

How do I get around this?

Thanks for your help
 

Padraig Lennon
Senior Systems Engineer
Production Services
Pioneer Global Investments (Dublin)
5th Floor Georges Quay Plaza, Dublin 2
ext: XXXX
Direct dial: 00353 1 480 2081

list Etienne Grignon · Wed, 11 Jun 2008 11:09:29 +0200 ·
Hi Padraig,

You will have to comment out the line in bbwin.cfg on node 2 until the
second node becomes the active node. It is a manual action, which I
know is not a good idea, but there is no other alternative for that.
However, you can try removing the specific F: rule and changing the
default rule, expressed in %, so that it guarantees you always have
50 GB left on your F: drive. That way you won't get alerts on the
second node because the F: drive is missing.

Regards,


-- 
Etienne GRIGNON
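
The percentage rule suggested above can be worked out as follows. This is a sketch only, assuming 1 TB = 1024 GB and a rule expressed as "percent used"; the function names and the choice to return "clear" for a drive that is not visible on the node are assumptions made for the example, not documented BBWin behaviour.

```python
from typing import Dict, Tuple


def used_pct_threshold(total_gb: float, min_free_gb: float) -> float:
    """Percent-used level at which exactly min_free_gb remains free."""
    if total_gb <= 0 or not 0 <= min_free_gb <= total_gb:
        raise ValueError("need 0 <= min_free_gb <= total_gb and total_gb > 0")
    return 100.0 * (1 - min_free_gb / total_gb)


def disk_colour(drives: Dict[str, Tuple[float, float]],
                drive: str, threshold_pct: float) -> str:
    """drives maps a drive letter to (total_gb, free_gb) as seen on this node."""
    if drive not in drives:
        # The shared drive is owned by the other node: report no data
        # instead of raising an alert (assumed behaviour).
        return "clear"
    total_gb, free_gb = drives[drive]
    used_pct = 100.0 * (total_gb - free_gb) / total_gb
    return "red" if used_pct >= threshold_pct else "green"


if __name__ == "__main__":
    threshold = used_pct_threshold(1024, 50)  # 1 TB drive, 50 GB floor
    print(f"alert at {threshold:.1f}% used")
    print(disk_colour({"F": (1024, 40)}, "F", threshold))  # active node, low space
    print(disk_colour({}, "F", threshold))                 # passive node
```

For a 1 TB drive, a 50 GB floor corresponds to roughly 95% used, so the same percentage rule can sit on both nodes without naming a node-specific size.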
list Padraig Lennon · Thu, 12 Jun 2008 12:50:07 +0200 ·
Thanks Etienne,


Padraig Lennon
Senior Systems Engineer
Production Services
Pioneer Global Investments (Dublin)
5th Floor Georges Quay Plaza, Dublin 2
ext: XXXX
Direct dial: 00353 1 480 2081

list Aaron Zink · Thu, 12 Jun 2008 17:05:48 -0700 ·
Hello,

I too am using bbwin extensively to monitor our Windows environment, and we have several clusters. In a Windows cluster, monitoring some services per client will work (especially for TCP monitors), but it is not an ideal solution for several reasons:

1. Active-passive clusters have services and ports that run on one node but not the other, which makes them impossible to monitor per node. Bbcombotest can sort of be used, but it does not work very well for this.

2. I have yet to get the file checks to work, and in any case checking a file on a shared drive wouldn't work.

3. externals.exe to monitor the cluster is nice, but there are times when the cluster is "fine" according to cluster manager while a shared disk is not accessible.

I had an idea for monitoring clusters and was wondering about its feasibility: re-add the HOSTNAME configuration entry to the bbwin.cfg file, and run two instances of bbwin.exe on the client. One would be the default (reading the hostname from the machine), and the other would be manually configured in the .cfg to use the cluster name. This is currently not possible because the hostname can only be overridden in the registry, which both bbwin instances read.

I don't have a development environment set up to test this myself, but in theory it should work.


Aaron Zink
Corporate IT Manager
eHarmony.com
XXX.XXX.XXXX
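
To illustrate the active-passive problem in point 1: a clustered service is healthy when it runs on exactly one node, so the per-node results have to be combined before a colour is chosen. The sketch below is not bbcombotest; the function name and the choice of yellow for "running on more than one node" are assumptions made for the example.

```python
from typing import Dict


def clustered_service_colour(node_colours: Dict[str, str]) -> str:
    """Combine the colour of one service test as seen on each node."""
    green_nodes = [n for n, c in node_colours.items() if c == "green"]
    if len(green_nodes) == 1:
        return "green"   # running on exactly one node, as expected
    if not green_nodes:
        return "red"     # not running anywhere
    return "yellow"      # running on several nodes: unexpected (assumed policy)


if __name__ == "__main__":
    print(clustered_service_colour({"fileclus1n1": "green", "fileclus1n2": "red"}))
    print(clustered_service_colour({"fileclus1n1": "red", "fileclus1n2": "red"}))
```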


list Aaron Zink · Mon, 7 Jul 2008 17:43:35 -0700 ·
Has anyone had any thoughts on this? It is really the only thing lacking in our Windows monitoring environment.

Simply re-introducing the optional hostname directive in bbwin.cfg and running two instances on each host seems like it would work. BBWin would check the .cfg first, then the registry, and finally fall back to the machine hostname.


Aaron Zink
Manager, Corporate IT
eHarmony.com
XXX.XXX.XXXX
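
The lookup order proposed above can be sketched as below. This describes the suggested behaviour only, not what BBWin does today; the function name and parameters are hypothetical, and the values are passed in rather than read from a real .cfg file or registry key.

```python
import socket
from typing import Optional


def resolve_hostname(cfg_value: Optional[str],
                     registry_value: Optional[str],
                     machine_name: Optional[str] = None) -> str:
    """Pick the reporting host name: .cfg first, then registry, then machine."""
    for candidate in (cfg_value, registry_value):
        # Treat missing and blank entries the same way: skip them.
        if candidate and candidate.strip():
            return candidate.strip()
    return machine_name or socket.gethostname()


if __name__ == "__main__":
    # Second instance: the .cfg overrides the name with the cluster name.
    print(resolve_hostname("fileclus1", None, "fileclus1n1"))
    # Default instance: nothing configured, so the machine name is used.
    print(resolve_hostname(None, None, "fileclus1n1"))
```

With this order, two instances on the same host can share the registry and still report under different names, because only the second instance sets the .cfg entry.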
list Heinelt Maik · Tue, 08 Jul 2008 10:54:22 +0900 ·
I asked this question some weeks ago too, but nobody answered.

Why is the uptime only displayed for Windows servers?
The color marker is only set for Windows servers; for Linux machines I can only see the status and the uptime counter in the log.

I would also like the uptime to be displayed for Linux and all other non-Windows machines.

Maybe someone can help me with this question now?

Regards


Maik
list Gatis A. · Tue, 8 Jul 2008 08:54:16 +0300 ·
For Linux systems, uptime is shown under "cpu".