Xymon Mailing List Archive

CPU Utilization, Less Averaging

5 messages in this thread

list James Wade · Fri, 16 Feb 2007 12:31:16 -0600 ·
Can I set the vmstat collection to average less and collect more
data points? The users here don't agree with the CPU Utilization
graph; they don't like the way the average doesn't fluctuate much.

Basically, they are comparing the Load graph with the Utilization
graph and want to see it correlate more with the Load graph.

My understanding is that the Utilization graph uses a 5-minute
CPU idle average from vmstat.

Can I change vmstat to take, for example, a 15-second average and
do more counts so that more data points are graphed?

Any suggestions would be appreciated.

Thanks,
James
list Greg L Hubbard · Fri, 16 Feb 2007 13:29:36 -0600 ·
Why do users think that the load average graph (which measures the
average length of the CPU run queue) should correlate with the CPU
utilization (busy time divided by available time)?  The values in each
graph are calculated very differently.
 
Just wondering...
 
GLH
list James Wade · Fri, 16 Feb 2007 13:55:41 -0600 ·
True, and I've checked by looking at what the CPU utilization
should be compared to the load average. Everything looks fine.

To be honest, it's user perception. They want to see a graph that
doesn't average but shows the peaks and troughs. There is another
tool here that does CPU utilization, and it doesn't average and
shows more peaks and troughs.

So I just want to reduce the averaging so that it takes more
samples.

As an example, they are looking at load doubling, but the CPU
utilization graph shows no change. However, because the box is a
multi-CPU box, the load doubling in this case only means an average
increase in utilization of 2%. And if the load goes back down within
5 minutes, the 2% isn't registered, so the graph shows a constant
CPU average.

What I have seen is that the utilization may go up by 10% or 15%
for only a few seconds and then drop back down (as seen in top).
This normally occurs when a large job hits. Because the CPU
utilization is averaged over 5 minutes, there is normally no change
in the graph.

I figure that if I drop the sample interval down, I'll get more
peaks and troughs and make them happy. How can I do this in Hobbit?

James
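
The effect James describes can be checked with a little arithmetic. The sketch below is only an illustration: the 20% baseline and the 10-second spike length are assumed values chosen for the example, not measurements from his system, and `window_average` is a hypothetical helper.

```python
def window_average(baseline_pct, spike_pct, spike_seconds, window_seconds):
    """Average utilization over a window that contains one short spike."""
    quiet_seconds = window_seconds - spike_seconds
    total = (baseline_pct * quiet_seconds
             + (baseline_pct + spike_pct) * spike_seconds)
    return total / window_seconds

# Assumed numbers: 20% baseline, a +15% spike lasting 10 seconds.
five_minute = window_average(20.0, 15.0, 10, 300)
fifteen_second = window_average(20.0, 15.0, 10, 15)

print(f"5-minute average:  {five_minute:.1f}%")
print(f"15-second average: {fifteen_second:.1f}%")
```

Under these assumptions the spike moves the 5-minute figure by half a percentage point, while a 15-second window containing the same spike reads 30%.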
list Henrik Størner · Fri, 16 Feb 2007 22:42:18 +0100 ·
On Fri, Feb 16, 2007 at 12:31:16PM -0600, James Wade wrote:
> Can I set the vmstat collection to average less
> and collect more data points?

You can change the collection to run every 15 seconds, but you will
also have to change the RRD file definition so it knows that you
intend to feed it data that often. Otherwise it will just discard
your extra datapoints.
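
To see why the storage step matters as much as the collection interval, here is a deliberately simplified model of a fixed-step store. It is not rrdtool's actual algorithm: `consolidate` is a hypothetical helper that keeps one averaged value per step, and the sample values are made up. However the extra samples are handled in practice, the point is that only one value per step survives.

```python
from collections import defaultdict

def consolidate(samples, step_seconds):
    """Fold (timestamp, value) samples into one averaged value per step."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp // step_seconds].append(value)
    return [sum(vals) / len(vals) for _, vals in sorted(buckets.items())]

# Ten minutes of 15-second samples: 20% busy, with a 30-second burst at 50%.
samples = [(t, 50.0 if 120 <= t < 150 else 20.0) for t in range(0, 600, 15)]

coarse = consolidate(samples, 300)  # store still defined with a 5-minute step
fine = consolidate(samples, 15)     # store redefined with a 15-second step

print(len(samples), "samples ->", len(coarse), "coarse points,",
      len(fine), "fine points")
print("peak with 300 s step:", max(coarse))
print("peak with 15 s step: ", max(fine))
```

In this model, collecting every 15 seconds changes nothing on the graph until the step is shortened as well: the 300-second store still holds two points, and the burst barely shows.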
> The users here don't agree with the CPU Utilization Graph,
> they don't like the way the Average doesn't fluctuate much.
>
> Basically, they are comparing the Load graph with the Utilization
> Graph and want to see the graph correlate more with the Load Graph.

Your users are comparing apples and oranges - those two numbers have
very little to do with each other.

Here's the explanation of the "load average" that you see in the "Load"
graph:

    System load averages is the average number of processes that are
    either in a runnable or uninterruptable state. A process in a
    runnable state is either using the CPU or waiting to use the CPU.
    A process in uninterruptable state is waiting for some I/O access,
    eg waiting for disk. The averages are taken over the three time
    intervals. Load averages are not normalized for the number of
    CPUs in a system, so a load average of 1 means a single CPU system
    is loaded all the time while on a 4 CPU system it means it was idle
    75% of the time.

The CPU utilisation graph IS normalized over the number of CPUs - that
in itself makes a difference (unless you have single CPU systems). But
more importantly, a process will show up on the "CPU utilisation" graph
ONLY when it is using CPU time; and it will show up on the "load
average" graph when it is using CPU time, AND when it is waiting to be
scheduled for some CPU time, AND while it is waiting for I/O to
complete.
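
The difference between the two numbers can be made concrete with a toy model. This is a sketch of a single instant only (the real load average is smoothed over time by the OS), and `snapshot` and its state names are hypothetical, chosen just for this illustration.

```python
def snapshot(process_states, cpu_count):
    """Return (load, utilization_pct) for one instant.

    process_states: iterable of "running", "runnable", "io_wait"
    or "sleeping". Load counts every process that is running, waiting
    for a CPU, or waiting for I/O. Utilization counts only the CPUs
    that are actually busy, normalized by the number of CPUs.
    """
    states = list(process_states)
    running = states.count("running")
    load = running + states.count("runnable") + states.count("io_wait")
    utilization_pct = 100.0 * min(running, cpu_count) / cpu_count
    return load, utilization_pct

# One busy process on a 4-CPU box: load 1, CPUs 25% busy (75% idle).
print(snapshot(["running"], 4))

# Same CPU work plus seven processes blocked on disk: load 8, still 25% busy.
print(snapshot(["running"] + ["io_wait"] * 7, 4))
```

In this model the load can rise eightfold while utilization does not move at all, which is the I/O-bound case described above.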

Since most server tasks are very I/O bound you will often see little
correlation between the two graphs. In fact, if the correlation becomes
too close it is usually a sign that your system needs more CPU power.


Regards,
Henrik
list Henrik Størner · Fri, 16 Feb 2007 22:47:44 +0100 ·
On Fri, Feb 16, 2007 at 01:55:41PM -0600, James Wade wrote:
> To be honest, it's user perception. They want to see a
> graph that doesn't average but shows the peaks and troughs.

Have you pointed out that the load average graph is based on a
"5 minute average" number calculated by the OS?


Henrik