CPU Utilization -- HELP!!!
list Rich Smrcina
It's entirely feasible that a process (as measured by the CPU Load) caused high utilization (as measured by CPU Utilization) for that period of time.
James Wade wrote:
I really need some help on the CPU Utilization graphs. They just don’t look correct. As an example, CPU Load on this box went to 120+, for an hour, but the CPU Utilization Graph for the same time period shows only 13% busy. James
-- Rich Smrcina VM Assist, Inc. Phone: XXX-XXX-XXXX Ans Service: XXX-XXX-XXXX user-61add9955ef9@xymon.invalid Catch the WAVV! http://www.wavv.org WAVV 2007 - Green Bay, WI - May 18-22, 2007
list James Wade
I really need some help on the CPU Utilization graphs. They just don't look correct. As an example, CPU Load on this box went to 120+, for an hour, but the CPU Utilization Graph for the same time period shows only 13% busy. James
list James Wade
I could not send all the graphs in the same email. This graph is from the same system as the other two, but it comes from a different monitoring tool that another group uses. It shows CPU Utilization at 100%, which is consistent with a load of 120+. Hobbit, however, showed a flatline at 13%. I'm seeing this across the board on all the systems, and the other group is using their tool to point out the discrepancies. Any suggestions? Thanks, James
list Greg L Hubbard
What OS, version, etc. And did you look into your client data file to see what is in there? And has anyone monkeyed with the graph definitions? You have the smoking gun, but that's about all we have to work on...
list James Wade
The operating system is Solaris. The graph definitions have not been changed. What should I look for in the client data? I'm not familiar with rrd. If I look at all the clients overall, though, I see the same thing: the averaged CPU Utilization is a flat line on everything, even during peak loads. Thanks, James
list Rich Smrcina
I think that info comes from vmstat; take a look at the vmstat output to see what it reports.
list Greg L Hubbard
James, I suspect that the CPU utilization numbers come from "vmstat"
output. There should be a section in the client data labeled [vmstat]
and the numbers that are graphed are listed in the far right hand
columns. I guess the [iostatcpu] section could be used as well -- that
adds the "wt" column to "us, sy, and id".
On one of my Solaris systems, I have something that looks like this:
[vmstat]
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s3 -- in sy cs us sy id
0 0 0 22683696 6595936 41 257 1 0 0 0 0 3 2 0 0 573 1703 631 1 2 97
0 0 0 22077552 5982992 38 240 0 0 0 0 0 3 3 0 0 608 3125 839 2 2 97
[iostatcpu]
cpu
us sy wt id
1 2 0 97
2 2 0 97
It is possible that your vmstat output is not formatted the way that
Hobbit expects (extra columns, perhaps?). Your graph also looks different:
usually the CPU utilization graph for Solaris is a stacked area chart,
not a line graph.
Others may be able to chime in with more things to consider. Good luck!
GLH
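The check Greg describes (do the far right-hand columns of the [vmstat] section really hold us, sy and id?) can be sketched in a few lines of Python. This is only an illustration, not Hobbit's actual parser: the function name `parse_vmstat_cpu` is hypothetical, and the sample text is the vmstat output quoted above with the mail wrapping undone.

```python
# Sample [vmstat] section body from the message above, unwrapped.
SAMPLE_VMSTAT = """\
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s3 -- in sy cs us sy id
0 0 0 22683696 6595936 41 257 1 0 0 0 0 3 2 0 0 573 1703 631 1 2 97
0 0 0 22077552 5982992 38 240 0 0 0 0 0 3 3 0 0 608 3125 839 2 2 97
"""


def parse_vmstat_cpu(text):
    """Return (us, sy, id) from the last data row of vmstat output.

    Hypothetical helper: raises ValueError when the layout is not what
    we assume (header ending in 'us sy id', one value per header column).
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]

    # Locate the column header: the row whose last three names are us/sy/id.
    header_idx = next(
        (i for i, row in enumerate(rows) if row[-3:] == ["us", "sy", "id"]),
        None,
    )
    if header_idx is None:
        raise ValueError("no column header ending in 'us sy id' found")
    header = rows[header_idx]

    # Data rows are the all-numeric rows that follow the header.
    data = [row for row in rows[header_idx + 1:]
            if all(tok.isdigit() for tok in row)]
    if not data:
        raise ValueError("no numeric data rows after the header")

    last = data[-1]
    # Extra or missing columns would shift the CPU fields, so flag them.
    if len(last) != len(header):
        raise ValueError(
            f"expected {len(header)} columns, got {len(last)}: layout differs"
        )

    us, sy, idle = (int(tok) for tok in last[-3:])
    return us, sy, idle


if __name__ == "__main__":
    us, sy, idle = parse_vmstat_cpu(SAMPLE_VMSTAT)
    print(f"us={us} sy={sy} id={idle} busy={100 - idle}%")
```

Running the same check against the [vmstat] section of a client that shows the flat 13% line would reveal whether the column count matches the header, which is the "extra columns" case raised above.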
list James Wade
Hi Greg,
Yes, it does come from vmstat. Hobbit takes a 5-minute average using vmstat. I think this is too long and perhaps doesn't work as well on Solaris. I suspect that when a system gets overloaded, the 5-minute average takes a while to complete, almost hanging.
I believe a better method of measuring CPU utilization would be to take 15-second averages over 2 minutes, i.e. 8 data samples, and put that in Hobbit. I've tried to do this, but I have not had much luck; I can't seem to get the rrd correct.
What I did do as a test was to change the 300-second average in the hobbitclient-sunos.sh file to a 15-second sample, and I was able to see better indications of CPU utilization. I know we can debate a 15-second CPU utilization average versus a 5-minute one, but what I've seen is that the 5-minute average just isn't working on about 100 Solaris boxes: it stays flatlined at a low percentage when the CPU is maxed. However, a 15-second average every 5 minutes isn't going to do it either, because you miss more than 4 minutes until Hobbit runs again and takes another 15-second average.
This has become a major issue here because they want to compare CPU Utilization with the number of transactions in the application log file. I could really use a workaround.
Thanks, James
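The arithmetic behind the sampling proposal above can be made concrete with a small Python sketch. It assumes a 300-second client cycle, as described in the message; the helper names `coverage` and `average_cpu` and the sample values are hypothetical, chosen only for illustration, and nothing here reflects how Hobbit itself stores data in rrd.

```python
CYCLE_SECONDS = 300  # assumed client reporting interval (5 minutes)


def coverage(sample_seconds, samples_per_cycle, cycle_seconds=CYCLE_SECONDS):
    """Fraction of each reporting cycle that is actually observed."""
    return sample_seconds * samples_per_cycle / cycle_seconds


def average_cpu(samples):
    """Average a list of (us, sy, id) tuples into one (us, sy, id) tuple."""
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


if __name__ == "__main__":
    # One 15-second sample per cycle: 285 of 300 seconds go unobserved.
    print(f"1 x 15s: {coverage(15, 1):.0%} of the cycle observed")
    # Eight 15-second samples (2 minutes) per cycle, as proposed above.
    print(f"8 x 15s: {coverage(15, 8):.0%} of the cycle observed")

    # Hypothetical eight samples from a box that is busy for half the window.
    samples = [(90, 10, 0)] * 4 + [(10, 5, 85)] * 4
    us, sy, idle = average_cpu(samples)
    print(f"averaged: us={us} sy={sy} id={idle}")
```

A single 15-second sample sees only 5% of each cycle, while eight of them see 40%; neither matches a full 300-second average, which is the trade-off being debated in the message.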