Xymon Mailing List Archive

custom client script and disable

12 messages in this thread

list Steve · Wed, 09 Nov 2005 12:49:57 -0500 ·
I didn't get any answers on this so I am posting it again.

Hobbit Monitor 4.1.2

I wrote a client script to monitor an application, added it to the clientlaunch.cfg file, and restarted Hobbit. The monitor works well.

Because the application I am monitoring is faulty I added to the client script a section of code to start up the application should it fail.  I shut down the application and Hobbit starts it up ... Cool.

To avoid having the application start up during the backup or maintenance window I disable the client script. 
I ran a test backup to be sure it worked, but it appears that Hobbit executes the client script even when the test is disabled, and the application goes from blue to red to green.

So now the 64K question ... what can I change so the test does not run when disabled, or ... where did I err?

Thanks.
list Henrik Størner · Wed, 9 Nov 2005 21:39:03 +0100 ·
On Wed, Nov 09, 2005 at 12:49:57PM -0500, Steve wrote:
> I wrote a client script to monitor an application, added it to the clientlaunch.cfg file, and restarted Hobbit. The monitor works well.
>
> Because the application I am monitoring is faulty I added to the client script a section of code to start up the application should it fail.  I shut down the application and Hobbit starts it up ... Cool.
>
> To avoid having the application start up during the backup or maintenance window I disable the client script.
> I ran a test backup to be sure it worked, but it appears that Hobbit executes the client script even when the test is disabled, and the application goes from blue to red to green.

You're right that Hobbit runs the client script regardless of whether the status that the script reports is disabled or not. hobbitlaunch, which runs the script, simply has no idea what status column(s) your script might report.

> So now the 64K question ... what can I change so the test does not run when disabled, or ... where did I err?

Have your script check if the test is disabled, and abort if it is. Something like this:

    STATUS=`$BB $BBDISP "query $MACHINE.mytest" | awk '{print $1}'`
    if [ "$STATUS" = "blue" ]; then exit 0; fi

The bb(1) man-page has some more examples of how you can use the "query"
command.
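
A slightly fuller sketch of the guard suggested above, assuming the script runs under the Hobbit client environment so that $BB, $BBDISP and $MACHINE are set. The test name "mytest" and the helper names status_color and is_disabled are placeholders for illustration, not part of Hobbit. Note that, as reported further down in this thread, "query" on an unpatched 4.1.2 server returns the raw color rather than blue.

```shell
#!/bin/sh
# Hypothetical helpers: neither function ships with Hobbit.

# Print the color word from the first line of a "query" response.
status_color() {
    printf '%s\n' "$1" | awk 'NR == 1 { print $1 }'
}

# Succeed (return 0) when the given "query" response reports blue,
# i.e. the test is disabled.
is_disabled() {
    [ "$(status_color "$1")" = "blue" ]
}

# Possible use near the top of the client script ("mytest" is a placeholder):
#   RESPONSE=$($BB $BBDISP "query $MACHINE.mytest")
#   if is_disabled "$RESPONSE"; then exit 0; fi
#   ... run the check, restart the application if needed, send the status ...
```

Keeping the parsing in small functions makes it possible to exercise the guard with a canned response before wiring it into the real script.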


Regards,
Henrik
list Steve · Wed, 09 Nov 2005 16:30:41 -0500 ·
Henrik,

I knew I could count on you to help me see the light.  I hadn't even considered checking the status and acting on the result.

Just out of curiosity, Big Brother didn't act this way too, did it?

-- 

Steve DiSorbo
System Programmer
Yale University ITS, AM&T Library Systems
Voice (XXX) XXX-XXXX
Fax   (XXX) XXX-XXXX
user-41ded5dfdc34@xymon.invalid
http://www.library.yale.edu/
list Henrik Størner · Wed, 9 Nov 2005 22:35:34 +0100 ·
Hi Steve,

On Wed, Nov 09, 2005 at 04:30:41PM -0500, Steve wrote:
> I knew I could count on you to help me see the light.  I hadn't even considered checking the status and acting on the result.
>
> Just out of curiosity, Big Brother didn't act this way too, did it?

BB would also run the script regardless of whether that status was disabled or not. And BB doesn't have any way for a client script to check if the status is disabled - I added this in Hobbit because I needed it for much the same reasons that you do.


Regards,
Henrik
list Steve · Wed, 09 Nov 2005 17:27:36 -0500 ·
I just looked at the history logs on my BB server and see that ... yep, you're right!!!
Not that I didn't believe you, but I was disillusioned.

-- 
Steve DiSorbo
System Programmer
Yale University ITS, AM&T Library Systems
Voice (XXX) XXX-XXXX
Fax   (XXX) XXX-XXXX
user-41ded5dfdc34@xymon.invalid
http://www.library.yale.edu/
list Steve · Thu, 10 Nov 2005 10:51:39 -0500 ·
So ...

I added the code you suggested, which queries bb for the status and tests for blue, but it is never blue, even when disabled.
To be sure that my client script is not changing the status, I put an exit statement in the script so the code is not processed.
Maybe I need to find a better way to restart the application when it fails, or get the vendor to address the problems with it.

Here is what I see.


When the test is enabled, the details screen is empty, as it should be: no test or status is performed.

# /usr/local/hobbit/server/bin/bb server "query server.application"
green


Hobbit 	
server - application

	Thu Nov 10 10:35:27 2005


Status unchanged in 0 hours, 51 minutes
Status message received from


When I disable the test, the detail screen shows that the status is green.

# /usr/local/hobbit/server/bin/bb server "query server.application"
green


Hobbit 	
server - application

	Thu Nov 10 10:40:24 2005


      Disabled until Thu Nov 10 14:40:24 2005

Disabled by: admin @ xxx.xxx.xxx.xxx
Reason: test
      

Current status message follows:


      green


Status unchanged in 0 hours, 52 minutes
Status message received from

-- 
Steve DiSorbo
System Programmer
Yale University ITS, AM&T Library Systems
Voice (XXX) XXX-XXXX
Fax   (XXX) XXX-XXXX
user-41ded5dfdc34@xymon.invalid
http://www.library.yale.edu/
list Frédéric Mangeant · Thu, 10 Nov 2005 16:55:55 +0100 ·
Steve wrote:
> So ...
>
> I added the code you suggested, which queries bb for the status and tests for blue, but it is never blue, even when disabled.

Hi,

Same here (with Hobbit 4.1.2): it returns the "true" color of the test, not blue:

$ ./bb localhost "query dashboard.conn"
red <!-- [flags:ordAstILe] --> Thu Nov 10 16:53:30 2005 conn NOT ok

It works fine on my bbgen 3.6 server:

$ ./bb localhost "query dashboard.conn"
blue   Wed Oct 26 18:24:17 2005    OFFLINE UNTIL Mon Jul 21 18:24:17 2008

-- 

Frédéric Mangeant

Steria EDC Sophia-Antipolis
list Frédéric Mangeant · Mon, 14 Nov 2005 17:18:41 +0100 ·
Hi Henrik

Using "hobbitdlog" (from 4.1.2p1) it works fine; the blue color shows up on the first line:


$ ./bb localhost "hobbitdlog dashboard.conn"
dashboard|conn|blue|ordAstILe|1131615360|1131984873|1140168960|0|1140168960|10.50.80.10|-1||Disabled by: unknown @ 10.50.8.55\nReason: Ne répond jamais\n|N
red <!-- [flags:ordAstILe] --> Mon Nov 14 17:14:25 2005 conn NOT ok

Service conn on dashboard is not OK : Host does not respond to ping


System unreachable for 6592 poll periods (1989310 seconds)

&red 10.50.80.2 is unreachable
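
Going only by the output shown above, the effective color appears as the third "|"-separated field on the first line of the "hobbitdlog" response. A minimal sketch of how a client script might read it on a server where "query" still returns the raw color; the helper name hobbitdlog_color and the test name "mytest" are placeholders, and the field position is an assumption taken from this one sample rather than from documentation.

```shell
#!/bin/sh
# Hypothetical helper: not part of Hobbit.

# Print the third "|"-separated field of the first line of a
# "hobbitdlog" response; in the sample above this is the color.
hobbitdlog_color() {
    printf '%s\n' "$1" | awk -F'|' 'NR == 1 { print $3 }'
}

# Possible use in a client script ("mytest" is a placeholder):
#   RESPONSE=$($BB $BBDISP "hobbitdlog $MACHINE.mytest")
#   if [ "$(hobbitdlog_color "$RESPONSE")" = "blue" ]; then exit 0; fi
```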

Regards,

-- 

Frédéric Mangeant

Steria EDC Sophia-Antipolis
list Henrik Størner · Mon, 14 Nov 2005 17:46:11 +0100 ·
On Mon, Nov 14, 2005 at 05:18:41PM +0100, Frédéric Mangeant wrote:
> same here (with Hobbit 4.1.2), it returns the "true" color of the test, and not blue:
>
> $ ./bb localhost "query dashboard.conn"
> red <!-- [flags:ordAstILe] --> Thu Nov 10 16:53:30 2005 conn NOT ok
>
> It works fine on my bbgen 3.6 server:
>
> $ ./bb localhost "query dashboard.conn"
> blue   Wed Oct 26 18:24:17 2005    OFFLINE UNTIL Mon Jul 21 18:24:17 2008
>
> using "hobbitdlog" (from 4.1.2p1) it works fine, the blue color shows up on the first line:

Eureka - I should have thought of that right away.

When hobbitd responds to a "query" command, it just spits back the 
first line of the last status message it received. But that will be
the "raw" status message - i.e. it will show red, because a disable
is handled through some internal status flags in Hobbit that aren't
reflected in the status message text hobbitd stores.

That is why when you look at the detailed status view for something
that is disabled, you'll see the "true" status below the "Disabled 
until .... " message.

This patch should fix it - against 4.1.2p1.


Regards,
Henrik

-------------- next part --------------
--- hobbitd/hobbitd.c.orig	2005-11-10 16:57:35.000000000 +0100
+++ hobbitd/hobbitd.c	2005-11-14 17:44:32.874945066 +0100
@@ -1978,11 +1978,23 @@
 			xfree(msg->buf);
 			msg->doingwhat = RESPONDING;
 			if (log->message) {
-				unsigned char *eoln;
-
-				eoln = strchr(log->message, '\n'); if (eoln) *eoln = '\0';
-				msg->buf = msg->bufp = strdup(msg_data(log->message));
+				unsigned char *bol, *eoln;
+				int msgcol;
+				char response[500];
+
+				bol = msg_data(log->message);
+				msgcol = parse_color(bol);
+				if (msgcol != -1) {
+					/* Skip the color - it may be different in real life */
+					bol += strlen(colorname(msgcol));
+					bol += strspn(bol, " \t");
+				}
+				eoln = strchr(bol, '\n'); if (eoln) *eoln = '\0';
+				snprintf(response, sizeof(response), "%s %s\n", colorname(log->color), bol);
+				response[sizeof(response)-1] = '\0';
 				if (eoln) *eoln = '\n';
+
+				msg->buf = msg->bufp = strdup(response);
 				msg->buflen = strlen(msg->buf);
 			}
 			else {
list Frédéric Mangeant · Tue, 15 Nov 2005 09:51:06 +0100 ·
Henrik Stoerner wrote:
> This patch should fix it - against 4.1.2p1.

Thanks, with this patch it works fine:

$ ./bb localhost "query dashboard.conn"
blue <!-- [flags:ordAstILe] --> Tue Nov 15 09:49:09 2005 conn NOT ok

-- 

Frédéric Mangeant

Steria EDC Sophia-Antipolis
list Allan Marillier · Wed, 30 Nov 2005 11:40:37 -0500 ·
We are undergoing some restructuring among support teams, and are separating and defining some responsibilities a little better. System admins will be called on system specific problems, while DBAs and apps support people will be called on others. One of my frequent complaints in the past has been getting calls on filled file systems containing Oracle trace files, dumps, logs, temporary files etc. that I don't feel I should be making decisions to remove, gzip etc. 
That is changing - DBAs will be called on those file systems in other environments where we're using HP's ITO (aka Vantage Point / OpenView, call it what you will).

I want to try to do the same on the servers I am monitoring with hobbit. Is there any way, or a possibility of a feature request that would allow me to specify alerts to me for all filesystems, other than a few specific ones? E.g. using a modified version of the hobbit-alerts page example:

HOST=www.foo.com
        MAIL user-d361980bf330@xymon.invalid SERVICE=disk.ora-db,disk.ora-idx,disk.ora-log REPEAT=1h
        MAIL user-09a89c618369@xymon.invalid SERVICE=cpu,disk,memory
list Craig Whilding · Wed, 30 Nov 2005 16:59:26 -0000 ·
On a similar note to this, I've just been looking at monitoring certain
processes or services which will need a different alert to others on the
same box. I don't want to have to write separate scripts to monitor each
service as the clients already check the status of each, I was just
wondering if it was possible server side to split the alerting of each
proc/service or allow the check to be redirected to another page.

 
An example of what I mean is that we check the virus scanner is running
on all systems but don't care about this as much as if our clearcase
server processes stopped working. 

 
Thanks,

Craig Whilding