Xymon Mailing List Archive

CPU Utilization -- HELP!!!

8 messages in this thread

Rich Smrcina · Fri, 09 Mar 2007 12:40:10 -0600
It's entirely feasible that a process (as measured by the CPU Load) caused high utilization (as measured by CPU Utilization) for that period of time.

James Wade wrote:
I really need some help on the CPU Utilization graphs.
They just don't look correct.

As an example, CPU Load on this box went to 120+
for an hour, but the CPU Utilization graph for the same
time period shows only 13% busy.

James

 
-- 
Rich Smrcina
VM Assist, Inc.
Phone: XXX-XXX-XXXX
Ans Service:  XXX-XXX-XXXX
user-61add9955ef9@xymon.invalid

Catch the WAVV!  http://www.wavv.org
WAVV 2007 - Green Bay, WI - May 18-22, 2007
James Wade · Fri, 9 Mar 2007 12:41:58 -0600
I really need some help on the CPU Utilization graphs.
They just don't look correct.

As an example, CPU Load on this box went to 120+
for an hour, but the CPU Utilization graph for the same
time period shows only 13% busy.

James
James Wade · Fri, 9 Mar 2007 12:48:30 -0600
I could not send all the graphs in the same email.
This is the same system as in the other two graphs,
but this graph is from a different monitoring tool that another
group uses. It shows CPU Utilization at 100%, which is
correct with a load of 120+. However, Hobbit showed
a flatline at 13%.

I'm seeing this across the board on all the systems,
and the other group is using the tool below to show
the discrepancies. Any suggestions?

Thanks... James
Greg L Hubbard · Fri, 9 Mar 2007 12:52:53 -0600
What OS, version, etc.  And did you look into your client data file to
see what is in there?  And has anyone monkeyed with the graph
definitions?
 
You have the smoking gun, but that's about all we have to work on...
James Wade · Fri, 9 Mar 2007 13:04:39 -0600
The Operating System is Solaris.
The graph definitions have not been changed.

What should I look for with the client data?
I'm not familiar with rrd.

Although if I look at all the clients overall, I see
the same thing: the averaging of CPU Utilization
is a flat line on everything, even during peak loads.

Thanks... James
Rich Smrcina · Fri, 09 Mar 2007 13:05:46 -0600
I think that info comes from vmstat; take a look at vmstat to see what it reports.
Greg L Hubbard · Fri, 9 Mar 2007 13:14:54 -0600
James, I suspect that the CPU utilization numbers come from "vmstat"
output.  There should be a section in the client data labeled [vmstat]
and the numbers that are graphed are listed in the far right hand
columns.  I guess the [iostatcpu] section could be used as well -- that
adds the "wt" column to "us, sy, and id".
 
On one of my Solaris systems, I have something that looks like this:
 
[vmstat]
kthr      memory            page            disk          faults      cpu
r b w   swap  free  re  mf pi po fr de sr s0 s1 s3 --   in   sy   cs us sy id
0 0 0 22683696 6595936 41 257 1 0 0  0  0  3  2  0  0  573 1703  631  1  2 97
0 0 0 22077552 5982992 38 240 0 0 0  0  0  3  3  0  0  608 3125  839  2  2 97
[iostatcpu]
     cpu
us sy wt id
  1  2  0 97
  2  2  0 97

It is possible that your vmstat output is not formatted the way that
Hobbit expects (extra columns?)  Your graph looks different -- usually
the CPU utilization graph for Solaris is a stacked area chart, not a
line graph.
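
A quick way to check the [vmstat] section by hand is to pull out the
last three columns and see whether they add up to roughly 100. The
snippet below is only a sketch in Python, not something Hobbit ships:
the function name parse_vmstat_cpu is made up, and it assumes the last
three fields of each data row are us, sy and id, as in the sample above.

```python
def parse_vmstat_cpu(section_text):
    """Return a list of (us, sy, id) tuples from Solaris-style vmstat output.

    Assumption: the last three whitespace-separated fields of each data
    row are us, sy and id. Header rows are skipped.
    """
    samples = []
    for line in section_text.splitlines():
        fields = line.split()
        # Data rows consist only of integers; header rows do not.
        if len(fields) < 3 or not all(f.isdigit() for f in fields):
            continue
        us, sy, idle = (int(f) for f in fields[-3:])
        samples.append((us, sy, idle))
    return samples


# The [vmstat] sample from this message, with the mail line-wrapping undone.
SAMPLE = """\
kthr      memory            page            disk          faults      cpu
r b w   swap  free  re  mf pi po fr de sr s0 s1 s3 --   in   sy   cs us sy id
0 0 0 22683696 6595936 41 257 1 0 0  0  0  3  2  0  0  573 1703  631  1  2 97
0 0 0 22077552 5982992 38 240 0 0 0  0  0  3  3  0  0  608 3125  839  2  2 97
"""

for us, sy, idle in parse_vmstat_cpu(SAMPLE):
    print(f"busy={us + sy}% idle={idle}% (sum={us + sy + idle})")
```

If the sums come out far from 100, the columns are probably not where
the parser expects them.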
 
Others may be able to chime in with more things to consider.  Good luck!
 
GLH
James Wade · Fri, 9 Mar 2007 13:41:00 -0600
Hi Greg,

 
Yes, it does come from vmstat. Hobbit takes a 5-minute average
using vmstat. I think this is too long and perhaps doesn't work as well
on Solaris. I suspect that when a system gets overloaded,
the 5-minute average takes a while to complete, almost hanging.

 
I believe a better way to measure CPU utilization would be to take
15-second averages over 2 minutes, i.e. 8 data samples, and put that
in Hobbit. I've tried to do this, but I have not had much luck. I can't
seem to get the rrd correct.
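
For what it's worth, the averaging step itself is simple. A minimal
sketch in Python, assuming the eight samples have already been collected
as (us, sy, id) tuples, for example from something like `vmstat 15 9`
with the first line (which reports averages since boot) dropped. The
function name and the sample values are made up for illustration, and
this does not touch the rrd side at all.

```python
def average_busy(samples):
    """Average busy percentage (us + sy) over a list of (us, sy, id) samples."""
    if not samples:
        raise ValueError("no samples")
    return sum(us + sy for us, sy, _ in samples) / len(samples)


# Eight hypothetical 15-second samples covering two minutes (made-up values).
samples = [
    (60, 35, 5), (70, 30, 0), (65, 30, 5), (80, 20, 0),
    (75, 25, 0), (10, 5, 85), (5, 5, 90), (70, 25, 5),
]

print(f"2-minute average busy: {average_busy(samples):.1f}%")
```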

 
What I did do as a test was to change the 300-second average in
the hobbitclient-sunos.sh file to a 15-second sample, and I was able
to see better indications of CPU utilization. I know we can debate a
15-second CPU utilization average versus a 5-minute one, but what I've
seen is that the 5-minute average just isn't working on about 100
Solaris boxes. It stays flatlined at a low percentage when the CPU is
maxed.

 
However, having a 15-second average every 5 minutes isn't going to
do it either, because you miss more than 4 minutes until Hobbit
runs again and takes another 15-second average.
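
To put numbers on that gap, here is a small sketch (Python; the helper
name coverage is hypothetical) of how much of each 5-minute cycle a
single sample actually observes:

```python
def coverage(sample_seconds, interval_seconds):
    """Fraction of each reporting interval observed by one sample."""
    return sample_seconds / interval_seconds


interval = 300  # seconds between client runs, per the 5-minute cycle above
sample = 15     # seconds covered by the single vmstat sample

observed = coverage(sample, interval)
unobserved_minutes = (interval - sample) / 60
print(f"observed {observed:.0%} of the cycle, "
      f"{unobserved_minutes:.2f} minutes unobserved")
```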

 
This has become a major issue here because they want to compare
CPU Utilization with the number of transactions in the application
log file.

 
I could really use a workaround.

Thanks... James

