Xymon Mailing List Archive

Help writing test -- sar

4 messages in this thread

list James Wade · Thu, 10 Jul 2008 10:44:52 -0500 ·
Hello All,
 
I've written my own "sar" test to put sar cpu data
into the database. It's been working fine now for a while.
I don't do any alerting, it's just graphing sar data.
 
However, a recent problem came up. Hobbit alerts on high
cpu load, but cpu load doesn't
take into account i/o wait problems. The cpu may have a low
load on it but excessive i/o wait. As an example, a
24 cpu box with a load of 12 on it may actually be at 99% utilization.
 
I've seen this on database servers where someone writes a
bad sql query to access the database, or an index file isn't
there. So I want to alert on cpu utilization in addition to cpu load.
 
I want to take the sar test I've written (output below) and
have it alert when idle time is less than 5%. I've already
modified my test for this.
 
However, when I was originally writing the sar test to put data into the
rrd database, if I included any kind of status message, it would not put
data into the database.
 
What's the format so that it keeps putting data into the database
even if I have a status message? or am I wrong on this?
 
Example:
 
Here's the output of my sar test now:
 
Thu Jul 10 10:35:20  CDT 2008
usr :  3
sys : 1
wio : 1
idle :  95
 
**** GRAPH *****

What I want is something like this if idle is less than 5%:

Thu Jul 10 10:40:30 CDT 2008
* CPU usage is above 95%
user : 40
sys :  10
wio :   47
idle : 3
 
*** GRAPH ****
 
So, if I add "CPU usage is above 95%" to the output,
will the stats continue to be written to the database?
Last time I tried adding status messages like above,
it didn't seem to put anything in the database.

Thanks....James
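
A minimal sketch of the idea described above, in Python: keep the `name : value` lines in exactly the same form in the normal and the alerting case, and only add the warning line and change the color. The function name `build_sar_status`, the test name `sar`, and the default 5% idle threshold are illustrative assumptions; the first line follows the usual `status HOST.TEST COLOR text` layout of a Hobbit status message, with dots in the hostname replaced by commas. Sending the message to the Hobbit server is left out.

```python
def build_sar_status(hostname, timestamp, values, idle_threshold=5):
    """Build a Hobbit-style status message for a hypothetical 'sar' test.

    values maps dataset names (e.g. 'usr', 'sys', 'wio', 'idle')
    to integer percentages.
    """
    alert = values["idle"] < idle_threshold
    color = "red" if alert else "green"

    # Hostnames in status messages use commas in place of dots.
    host = hostname.replace(".", ",")
    lines = [f"status {host}.sar {color} {timestamp}"]

    if alert:
        # Free-text warning line; it has no ':' in it, so a simple
        # 'name : value' matcher would skip it.
        lines.append(f"* CPU usage is above {100 - idle_threshold}%")

    # Emit the data lines in the same form whether or not we alert.
    for name, value in values.items():
        lines.append(f"{name} : {value}")

    return "\n".join(lines)


if __name__ == "__main__":
    sample = {"usr": 40, "sys": 10, "wio": 47, "idle": 3}
    print(build_sar_status("db1.example.com",
                           "Thu Jul 10 10:40:30 CDT 2008", sample))
```

The dataset names are passed in once and reused for both cases, so the labels cannot drift between the green and the red report.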
list James Wade · Thu, 10 Jul 2008 15:48:43 -0500 ·
If the test turns red, it stops logging data
even with no status message.
 
Henrik, if I have my own test using NCV,
and the test goes red, is there a reason
it would stop putting current NCV data
into the RRD database?
 
Any help would be appreciated. Or is there a
way that hobbit can alert on cpu usage instead
of cpu load (or both)?
 
Thanks....James
list Alan Sparks · Thu, 10 Jul 2008 20:22:31 -0600 ·
I've done this before, although these days I tend to use the custom
extra-script solution, since it is more powerful.  However, I've never had a
problem with a status message or particular color stopping graphing from
working.  The only obvious idea at the moment: do you in any way change
the line structure of the reported values, such that the lines
would not parse correctly after adding the status message?
-Alan
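
One way to check the line-structure question is to compare the data lines of a working report with those of the alerting one. The sketch below is a simplified stand-in for NCV-style parsing, not Hobbit's actual parser: it extracts `name : value` pairs with a regular expression and lists dataset names that appear in only one of two reports. The names `parse_pairs` and `changed_names` are hypothetical.

```python
import re

# Matches lines such as "idle :  95"; deliberately simple.
PAIR = re.compile(r"^\s*([A-Za-z_][\w ]*?)\s*:\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_pairs(report):
    """Return {name: value} for every 'name : value' line in the report."""
    pairs = {}
    for line in report.splitlines():
        match = PAIR.match(line)
        if match:
            pairs[match.group(1)] = float(match.group(2))
    return pairs


def changed_names(before, after):
    """Dataset names present in one report but not in the other."""
    return set(parse_pairs(before)) ^ set(parse_pairs(after))


if __name__ == "__main__":
    before = "Thu Jul 10 10:35:20  CDT 2008\nusr :  3\nsys : 1\nwio : 1\nidle :  95\n"
    after = ("Thu Jul 10 10:40:30 CDT 2008\n* CPU usage is above 95%\n"
             "user : 40\nsys :  10\nwio :   47\nidle : 3\n")
    print(sorted(changed_names(before, after)))  # prints ['user', 'usr']
```

Run against the two sample outputs from the first message, this reports `user` and `usr`, because the field is labeled differently in each sample. That may only be a typo in the email, but a renamed dataset is the kind of structural change that could keep values from reaching the existing RRD file.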
list Buchan Milne · Fri, 11 Jul 2008 12:37:29 +0200 ·
quoted from James Wade
On Thursday 10 July 2008 17:44:52 James Wade wrote:
> Hello All,
>
> I've written my own "sar" test to put sar cpu data
> into the database. It's been working fine now for a while.
> I don't do any alerting, it's just graphing sar data.

But, the vmstat graphs already graph cpu utilisation, and AFAIK Henrik has 
added alerting based on the vmstat data for cpu utilisation in the current 
development version.

Regards,
Buchan