iostat monitor
list Vernon Everett
Hi all
Just posted an iostat add-on, with graphing goodness, to
http://xymonton.trantor.org/doku.php/monitors:diskstat.ksh
Something interesting, for perverse values of interesting: Roland Soderstrom and I were discussing it in the thread titled "custom check / graphs with multiple rrd files".
None of my last 3 posts to that thread have appeared in the list archive.
This is not the first time this has happened to me.
It's beginning to look as though posts are dropping off threads if the thread gets too long.
Is this a mailing list issue, or a Gmail issue?
Has anybody else with Gmail experienced this?
Regards
Vernon
list Vernon Everett
Just updated the code. Minor change.
After discussion with a colleague we thought that a 2 second iostat sample
might not be adequate.
We settled on 10, but with the update, I have made it a parameter, which can
be set at server side.
Cheers
Vernon
list Ralph Mitchell
I see 12 messages in the thread, the last being yours about 6 hours ago
saying:
"Added the hobbitgraph stuff."
Ralph Mitchell
list Martin Flemming
Great work! One question: is it possible to add iostat for zpools? Especially on Solaris 10 machines, I think zpools are the default, aren't they? Thanks & cheers, Martin
On Thu, 9 Sep 2010, Ralph Mitchell wrote:
I see 12 messages in the thread, the last being yours about 6 hours ago saying:
"Added the hobbitgraph stuff."
Ralph Mitchell
list Steve Holmes
On Thu, Sep 9, 2010 at 1:56 AM, Vernon Everett <user-b3f8dacb72c8@xymon.invalid>wrote:
Hi all Just posted an iostat add-on, with graphing goodness to http://xymonton.trantor.org/doku.php/monitors:diskstat.ksh ...
Regards
Vernon
Vernon, thanks for this add-on. It looks useful. However, I have installed it per the instructions and on the client it appears to be doing the right thing (both client and server are Solaris 10 and I'm running Xymon 4.2.3).
Does it take a while for the graphs to fill in? I've been waiting for a couple of hours and they show up as broken graphs on the trends page. Not empty graphs, but broken. Does it matter that the SPLITNCV lines in hobbitgraph.cfg are the first ones I've added?
Thanks,
Steve
list Jerald Sheets
Converting to BASH now... (AIX? HP/UX? Why ksh?) --j
list Xymon User in Richmond
On Thu, September 9, 2010 14:03, Jerald Sheets wrote:
Converting to BASH now...
Thanks, Jerald. Was going to have a look at that tomorrow. If you'll post when done, I'll test for RHEL4-5.
list Roland Soderstrom
Vernon, I have a comment.
---------------------------------------------------------------------------------------------------------------
- -
- 100 dollar bill - -
- If you have a 10sec iostat which you run every 5min, as you set the test to run, you get huge gaps. -
- Say that your server always has a lot of IO every 3min for 1 min. -
- You will probably miss all this IO, so set it to the same as the clientlaunch time, 5min = 300, in the script. -
- iostat needs to run constantly to make accurate measurements. -
- That's why you see the default do 300 sec. -
- -
---------------------------------------------------------------------------------------------------------------
To get better accuracy you should probably run "iostat dxn 300 13" and run the script from clientlaunch every hour.
A bit more coding to take care of the 11 "new" iostat though.
- Roland
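To put rough numbers on the gap described above, here is a minimal sketch of the arithmetic. The function name coverage_pct is made up for illustration and is not part of diskstat.ksh; it only divides the sample length by the launch interval.

```shell
#!/bin/sh
# Percentage of each launch interval that iostat actually observes.
# $1 = sample length in seconds, $2 = seconds between launches.
# (coverage_pct is a hypothetical helper, not part of the posted script.)
coverage_pct() {
    awk -v s="$1" -v i="$2" 'BEGIN { printf "%.1f\n", 100 * s / i }'
}

coverage_pct 10 300     # 10-second sample launched every 5 minutes
coverage_pct 300 300    # sample as long as the launch interval
```

With a 10-second sample roughly 3% of each five-minute interval is measured, which is why a one-minute burst can easily fall entirely between samples.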
list Vernon Everett
Hi Roland
A good comment, and one I did consider.
My concern with a long sample, is that it becomes "too averaged".
If I put one hand in the fire, and the other in liquid nitrogen, on average,
I am doing fine, right? :-)
I think it's one of those things that will need to be flavoured to taste,
depending on your requirements, which is why I changed it to be a soft
parameter that could be set at server-side.
Ultimately, there is no difference in the results of running iostat 300 13
every hour, and iostat 300 2 every 5 minutes for one hour.
And will save a lot of coding :-)
Cheers
Vernon
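The "too averaged" concern can be illustrated with a small calculation. This is only a sketch under a simplifying assumption: the disk is busy at a fixed percentage during one burst and completely idle for the rest of the sample window. The name window_avg is hypothetical.

```shell
#!/bin/sh
# Average %busy reported for a window that contains one burst.
# $1 = burst length (s), $2 = window length (s), $3 = %busy during the burst.
# Assumes the disk is idle outside the burst (illustrative simplification).
window_avg() {
    awk -v b="$1" -v w="$2" -v p="$3" 'BEGIN { printf "%.1f\n", p * b / w }'
}

window_avg 60 300 100    # one-minute burst inside a 300-second sample
window_avg 10 10 100     # 10-second sample that falls inside the burst
```

The long sample never misses the burst but reports it as a modest average; the short sample reports the full peak, but only if it happens to land on it.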
list Vernon Everett
I could just adopt a dogmatic stance, and say, "My script, my choice". End of argument!
As true as this may be, all it will do is start a shell Jihad, and the last thing we need on this list is a religious war.
For a less dogmatic answer, I have found some "inconsistencies" with some versions of bash and sh which I don't like. The most annoying one being that variables defined within some looping constructs are no longer defined once you exit the loop. But there are others. That's my logical reason.
The other reason is my comfort zone. I worked for Sun Microsystems for a few years, mostly writing scripts to test the OS components. To ensure complete portability over our entire fleet (which included Linux back then) everything was written in ksh. I got to know it well, and liked it. I am comfortable with it.
I always assert, write your code in whatever you are most comfortable with.
If you want to convert your version to run bash, go for it. This is the joy of open source.
Regards
Vernon
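The loop behaviour mentioned in this message most likely refers to loops fed by a pipe. That reading is an assumption, since the message does not name the construct. A minimal sketch of the difference:

```shell
#!/bin/sh
# Count lines with a loop fed by a pipe. bash and dash normally run this
# loop in a subshell, so the outer variable is unchanged afterwards; ksh
# runs the last pipeline stage in the current shell and keeps the value.
piped=0
printf 'a\nb\nc\n' | while read -r line; do piped=$((piped + 1)); done
echo "after piped loop: $piped"

# Feeding the same loop from a here-document keeps it in the current
# shell, so the count survives in all of these shells.
direct=0
while read -r line; do direct=$((direct + 1)); done <<EOF
a
b
c
EOF
echo "after here-document loop: $direct"
```

Scripts that must run unchanged under both ksh and bash can use the second form whenever the loop needs to set variables for later use.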
list Vernon Everett
Hmmm.
I did forget the graph definitions, and added them shortly afterwards.
Did you get the graph definitions to add to hobbitgraph.cfg?
I started getting lines appearing after 15 minutes (about what you would expect).
Did you restart hobbit server?
Remember, significant changes to hobbitserver.cfg do require a restart.
Don't know if these changes are classified as significant enough, but I did a restart anyway.
Let me know how you go.
Regards
Vernon
list Steve Holmes
Wherever you go, there you are.
On Sep 9, 2010, at 7:33 PM, Vernon Everett <user-b3f8dacb72c8@xymon.invalid> wrote:
Hmmm. I did forget the graph definitions, and added them shortly afterwards. Did you get the graph definitions to add to hobbitgraph.cfg? I started getting lines appear after 15 minutes. (about what you would expect)
Yes I got all of those definitions I think.
Did you restart hobbit server?
Yes, I did that, too. I won't get back to this for a few days, but I will double check. I am getting rrd files, if that makes any difference. Thanks, Steve
list Vernon Everett
Indeed, they eventually appeared. It's puzzling. Every so often, mails I post to the list just vanish. Sometimes, they appear later. They are in my sent items, but just don't appear on the list or the archive. I am trying to detect a pattern, but so far nothing. I have even investigated the highly likely possibility of it being a PEBKAC. But it doesn't appear to be. I will keep investigating.
Cheers
Vernon
list Daniel Bourque
Sorry I can't reply to the thread; for some reason I quit receiving the emails. I checked the archive and noticed the replies to my former thread. (Thanks!)
Vernon, since I don't run Solaris here, only Linux and some Tru64, the -r (CSV output) and -n (friendly names) options make it hard to use your shell script, since they either don't exist or don't work the same. Can you perhaps provide a sample output of "iostat -xrn", along with the formatted text you pass to hobbit in your check?
I can then provide a snippet of code for Linux, which would provide the equivalent output. So you could just add a case in the shell script.
case `uname` in
    Linux)
        /usr/bin/iostat -x $DURATION 2 | wonderful stuff > $TEMPFILE.raw
        ;;
    SunOS)
        /usr/bin/iostat -xrn $DURATION 2 > $TEMPFILE.raw
        ;;
esac
--
Dan
list Vernon Everett
Could give it a go.
Send me the output of iostat -x 2 2 for your favourite OS(s)
Where I am now, I only have Solaris, hence the bias.
Cheers
V
list Isaac W Traxler
Hi,
Jason and I have hacked at the iostat script and got something that seems to work on Linux. We have not cleaned it up nearly enough or fixed all that we need to. Along with changing the script, the graph definitions also need to be changed. Here is what we have done with iostat so far:
#!/bin/bash
# uname -s prints "SunOS" on Solaris and "Linux" on Linux.
OS=$(uname -s)
PID=$$
if [[ ${OS} == "SunOS" ]]
then
    IOSTAT='/usr/bin/iostat -xrn'
    HEADER='%b,device'         # column header line of each Solaris sample
else
    IOSTAT='/usr/bin/iostat -x'
    HEADER='^Device'           # column header line of each Linux sample
fi
TEMPFILE=${BBTMP}/diskstat.tmp.${PID}
SHOW_NFS=no     # Set this to yes on server side clientlocal.cfg to change it
                # DISKSTAT:SHOW_NFS=yes
DURATION=270    # The duration of the iostat sample
                # This can be updated in the same way as above
# Now we redefine some variables, if they are set in clientlocal.
# The loop reads from a process substitution rather than a pipe, so the
# assignments happen in this shell and not in a subshell.
LOGFETCH=${BBTMP}/logfetch.$(uname -n).cfg
if [ -f "${LOGFETCH}" ]
then
    while read -r NEW_DEF
    do
        export "${NEW_DEF}"
    done < <(grep "^DISKSTAT:" "${LOGFETCH}" | cut -d":" -f2)
fi
> "${TEMPFILE}"    # Make sure it's empty
TEMPFILERAW="${TEMPFILE}.raw"
${IOSTAT} ${DURATION} 2 > "${TEMPFILERAW}"    # And collect some data to work with.
# We have to collect 2 sets, because the first set is the average since boot.
# Define where the second set of data starts
LINE=$(grep -n "${HEADER}" "${TEMPFILERAW}" | tail -1 | cut -d":" -f1)
# take the second set, and massage it into usable data
TEMPFILEDATA="${TEMPFILE}.data"
if [[ ${OS} == "SunOS" ]]
then
    # Comma separated, device name last: move the device name to the front.
    awk "NR>${LINE}" "${TEMPFILERAW}" \
    | sed "s/,/ /g" \
    | awk '{ print $NF" "$0 }' \
    | awk '{ $NF="";print }' > "${TEMPFILEDATA}"
else
    # Device name is already first: drop blank lines and squeeze the
    # column padding to single spaces so that cut can split on them.
    awk "NR>${LINE} && NF" "${TEMPFILERAW}" \
    | awk '{ $1=$1;print }' > "${TEMPFILEDATA}"
fi
rm "${TEMPFILERAW}"
# Now we format the data and send it off to the server
# (named SUBTESTS because bash uses COLUMNS for the terminal width)
if [[ ${OS} == "SunOS" ]]
then
    SUBTESTS="reads writes kreads kwrites wait actv svct pw pb"
else
    SUBTESTS="rrqm wrqm r w rsec wsec avgrq-sz avgqu-sz await svctm util"
fi
count=1
for subtest in ${SUBTESTS}
do
    ((count=count+1))
    echo "" >> "${TEMPFILE}"
    cut -d" " -f1,${count} "${TEMPFILEDATA}" \
    | while read DEVICE VAL
    do
        echo "${DEVICE}" | grep ":/" > /dev/null
        if [ $? -eq 0 -a "${SHOW_NFS}" = "no" ]
        then
            continue    # skip this NFS mount but keep processing the rest
        else
            DEVICE=$(echo ${DEVICE} | tr : - )
        fi
        echo "${DEVICE}:${VAL}" >> "${TEMPFILE}"
    done
    echo "" >> "${TEMPFILE}"
    ${BB} ${BBDISP} "data ${MACHINE}.diskstat-${subtest} $(echo; cat ${TEMPFILE} ;echo "" ;echo "ignore this" )"
    # Without the last echo "ignore this", it seems to not graph the last entry.
    # Odd really, but that seems to fix it.
    rm "${TEMPFILE}"
done
rm "${TEMPFILEDATA}"
--
Isaac Traxler AIX,Linux Admin
Louisiana State University user-4dfb0dbf036e@xymon.invalid
High Performance Computing XXX-XXX-XXXX
LONI AIX Clusters
AIX, Linux Support
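The script above picks up overrides such as DISKSTAT:SHOW_NFS=yes from the logfetch file. A more defensive way to apply such lines might look like the sketch below. The function name apply_diskstat_overrides and the sample file contents are hypothetical. The loop accepts only plain NAME=value definitions and is fed from a here-document, so the assignments are made in the current shell under sh, ksh and bash alike.

```shell
#!/bin/sh
# Apply "DISKSTAT:NAME=value" lines from a file to the current shell.
# (apply_diskstat_overrides is an illustrative name, not from the thread.)
apply_diskstat_overrides() {
    [ -f "$1" ] || return 0
    while IFS= read -r def; do
        case $def in
            [A-Z_]*=*) ;;                  # looks like NAME=value
            *) continue ;;                 # ignore anything else
        esac
        name=${def%%=*}
        value=${def#*=}
        case $name in
            *[!A-Z0-9_]*) continue ;;      # refuse odd variable names
        esac
        eval "$name=\$value"               # assign without executing the value
    done <<EOF
$(grep '^DISKSTAT:' "$1" | cut -d: -f2-)
EOF
}

# Demonstration with a made-up config file.
cfg=$(mktemp)
printf '%s\n' 'DISKSTAT:SHOW_NFS=yes' 'DISKSTAT:DURATION=60' 'log:/var/adm/messages:10240' > "$cfg"
SHOW_NFS=no
DURATION=270
apply_diskstat_overrides "$cfg"
echo "SHOW_NFS=$SHOW_NFS DURATION=$DURATION"
rm -f "$cfg"
```

Using cut -f2- rather than -f2 keeps values that themselves contain a colon intact.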
On Tue, 14 Sep 2010, Daniel Bourque wrote:
Date: Tue, 14 Sep 2010 16:22:13 -0500
From: Daniel Bourque <user-a141068964db@xymon.invalid>
Reply-To: xymon at xymon.com
To: xymon at xymon.com
Subject: [xymon] iostat monitor
Sorry I can't reply to the thread for some reason i quit receiving the emails, I checked the archive and noticed the replies to my former thread. ( thanks ! )
Vernon, since I don't run solaris here, only linux and some tru64, the -r ( csv output ) and -n ( friendly names ) options makes it hard to use your shell script since they either don't exists or don't work the same. Can you perhaps provide a same output of "iostat -xrn" and along with formated text you pass to hobbit in your check.
I can then provide a snippet of code for linux, which would provide the equivalent output. So you could just add a case in the shell script.
case `uname` in
Linux)
/usr/bin/iostat -x $DURATION 2 | wonderful stuff > $TEMPFILE.raw
;;
SunOS)
/usr/bin/iostat -xrn $DURATION 2 > $TEMPFILE.raw
;;
esac
--
Dan
list Vernon Everett
If everybody who has done similar hacks can send me their magic, then maybe we can fold it into the code, and create a more universal script. We will just have to get past the ksh/bash difference of opinion. :-)
Cheers
V
list Daniel Bourque
Here is a sample run.
$ iostat -xd 5 2
Linux 2.6.18-92.1.22.el5PAE (host.bla.com) 09/15/2010
Device:         rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda               0.00    0.00    0.00    0.00    0.07    0.00    92.89     0.00    4.31    3.78    0.00
sdb              23.63    8.98   21.65   27.30   14.39  120.96     2.77     0.10    1.95    1.55    7.59
dm-0              0.00    0.00    0.11    3.01    2.27   24.08     8.44     0.04   11.50    1.04    0.32
dm-1              0.00    0.00    0.34    1.80    6.38   14.38     9.71     0.02   10.15    1.59    0.34
dm-2              0.00    0.00    0.00    0.33    0.01    2.62     8.00     0.00    7.70    3.09    0.10
dm-3              0.00    0.00    0.47   28.77   33.15   90.77     4.24     0.06    2.03    0.17    0.51
dm-4              0.00    0.00    0.00    0.00    0.00    0.00     8.00     0.00   18.27    2.54    0.00
dm-5              0.00    0.00   44.45    2.42  111.93  123.38     5.02     0.02    0.37    1.59    7.45
drbd0             0.00    0.00   44.45  128.36  111.89   51.42     4.88     0.10    3.45    3.94   13.17
Device:         rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda               0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
sdb               0.00   35.40    3.80   69.20   32.00  821.20    11.69     0.04    0.59    0.52    3.76
dm-0              0.00    0.00    0.20    2.80    3.20   22.40     8.53     0.00    1.13    1.13    0.34
dm-1              0.00    0.00    0.00    1.80    0.00   14.40     8.00     0.00    0.00    0.00    0.00
dm-2              0.00    0.00    0.00    0.60    0.00    4.80     8.00     0.00    0.00    0.00    0.00
dm-3              0.00    0.00    0.00    2.00    0.00   16.00     8.00     0.00    0.00    0.00    0.00
dm-4              0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00
dm-5              0.00    0.00    3.60   97.40   28.80  761.00     7.82     0.04    0.40    0.34    3.40
drbd0             0.00    0.00    3.60   94.80   28.80  758.40     8.00     0.09    0.96    0.56    5.48
I wish the maintainer of iostat would add a friendly name option. It would not be so hard to code a device-mapper -> LVM translation using "dmsetup ls"; the problem is that you can only run that command as root (guess you could add a sudo rule...).
Anyway, if you don't find the time to work on this, just provide a sample output of Solaris's /usr/bin/iostat -xrn and I'll post the changes needed to the list.
Thanks!
Dan
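A rough sketch of the dm-N to name translation mentioned above. It assumes that "dmsetup ls" prints one mapping per line as a name followed by "(major:minor)" or "(major, minor)", and that the minor number is the N in dm-N; both should be checked on the target system. The function name dm_names and the sample volume names are invented for illustration.

```shell
#!/bin/sh
# Turn "dmsetup ls"-style lines into "dm-N name" pairs.
# Assumed input format: "<name> (<major>:<minor>)" or "<name> (<major>, <minor>)".
dm_names() {
    awk '{
        minor = $NF
        sub(/^.*[(,:]/, "", minor)    # drop everything up to the minor number
        sub(/\)$/, "", minor)         # drop the closing parenthesis
        print "dm-" minor, $1
    }'
}

# Made-up sample input covering both assumed formats.
printf 'vg0-root\t(253:0)\nvg0-swap\t(253, 1)\n' | dm_names
```

On a real host the input would come from something like "sudo dmsetup ls | dm_names", given a suitable sudo rule.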
list Vernon Everett
Hi Daniel
This is significantly different to the Solaris output. Linux gives a few
extra fields, and leaves out a few that Solaris has.
We may need to play with this to find some common ground.
Alternatively, we just accept and embrace the differences, and add a few
extra graphs.
We then set a SUBTEST string within the case statement.
The loop will then be
for subtest in $SUBTEST
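A minimal sketch of that idea: one case statement sets both the iostat flags and the SUBTEST list, and the loop stays the same for every OS. The function name set_diskstat_profile and the Solaris subtest names are assumptions made for illustration. The real names have to match whatever the graph definitions on the server expect.

```shell
#!/bin/sh
# Pick iostat flags and subtest names for a given OS name (as printed
# by "uname -s"). The subtest names are illustrative placeholders.
set_diskstat_profile() {
    case $1 in
        SunOS)
            IOSTAT_FLAGS="-xrn"
            SUBTEST="reads writes kreads kwrites wait actv wsvc_t asvc_t pw pb"
            ;;
        Linux)
            IOSTAT_FLAGS="-x"
            SUBTEST="rrqm wrqm r w rsec wsec avgrq-sz avgqu-sz await svctm util"
            ;;
        *)
            echo "unsupported OS: $1" >&2
            return 1
            ;;
    esac
}

set_diskstat_profile Linux
for subtest in $SUBTEST
do
    echo "diskstat-$subtest"
done
```

In the real script the argument would be "$(uname -s)" rather than a literal OS name.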
The output I used was as below.
Make yours look similar, and we have a winner.
The output then gets massaged a little with this set of commands (commented
for clarity)
cat $TEMPFILE.raw | awk "NR>$LINE" |          # take only the last set
    sed "s/,/ /g" |                           # make it space separated
    awk '{ print $NF" "$0 }' |                # move device name to front
    awk '{ $NF="";print }' > $TEMPFILE.data   # dump the device name at the end
iostat -xrn 2 2
extended device statistics
r/s,w/s,kr/s,kw/s,wait,actv,wsvc_t,asvc_t,%w,%b,device
1.4,3.9,93.0,68.8,0.0,0.2,0.0,47.0,0,3,vdc0
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0,0,vdc1
1.4,4.0,92.8,68.8,0.0,0.3,0.0,61.0,0,4,vdc2
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,blackbox:/data/scratch
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,infdomB1:/export/DRhome
0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.7,0,0,infdomB1:/export/home
0.0,0.0,0.0,0.0,0.0,0.0,0.0,9.1,0,0,infdomD2:/data/software
extended device statistics
r/s,w/s,kr/s,kw/s,wait,actv,wsvc_t,asvc_t,%w,%b,device
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,vdc0
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,vdc1
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,vdc2
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,blackbox:/data/scratch
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,infdomB1:/export/DRhome
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,infdomB1:/export/home
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,infdomD2:/data/software
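As a worked example, the massage pipeline can be applied to two lines taken from the first sample above. The wrapper function massage is a name chosen here for illustration; the sed and awk stages are the ones from the message.

```shell
#!/bin/sh
# Move the trailing device name of each comma-separated iostat line to
# the front, leaving space-separated values after it.
massage() {
    sed 's/,/ /g' |
        awk '{ print $NF" "$0 }' |
        awk '{ $NF=""; print }'
}

printf '%s\n' \
    '1.4,3.9,93.0,68.8,0.0,0.2,0.0,47.0,0,3,vdc0' \
    '0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.7,0,0,infdomB1:/export/home' | massage
```

Each output line starts with the device name followed by the ten values, so cut -d" " -f1,N can then pick the device and one column at a time.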
From man iostat
-n Display names in descriptive format. For exam-
ple, cXtYdZ, rmt/N, server:/export/path.
-r Display data in a comma-separated format.
-x Report extended disk statistics.
Output
The output of the iostat utility includes the following
information.
device name of the disk
r/s reads per second
w/s writes per second
kr/s kilobytes read per second
The average I/O size during the interval can be
computed from kr/s divided by r/s.
kw/s kilobytes written per second
The average I/O size during the interval can be
computed from kw/s divided by w/s.
wait average number of transactions waiting for service
(queue length)
This is the number of I/O operations held in the
device driver queue waiting for acceptance by the
device.
actv average number of transactions actively being ser-
viced (removed from the queue but not yet com-
pleted)
This is the number of I/O operations accepted, but
not yet serviced, by the device.
svc_t average response time of transactions, in mil-
liseconds
The svc_t output reports the overall response
time, rather than the service time, of a device.
The overall time includes the time that transac-
tions are in queue and the time that transactions
are being serviced. The time spent in queue is
shown with the -x option in the wsvc_t output
column. The time spent servicing transactions is
the true service time. Service time is also shown
with the -x option and appears in the asvc_t out-
put column of the same report.
%w percent of time there are transactions waiting for
service (queue non-empty)
%b percent of time the disk is busy (transactions in
progress)
wsvc_t average service time in wait queue, in mil-
liseconds
asvc_t average service time of active transactions, in
milliseconds
wt the I/O wait time is no longer calculated as a
percentage of CPU time, and this statistic will
always return zero.
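The man page's note on average I/O size can be checked against the vdc0 line of the first sample above (r/s 1.4, kr/s 93.0, w/s 3.9, kw/s 68.8). The helper name avg_io_kb is invented for this sketch.

```shell
#!/bin/sh
# Average I/O size in kilobytes: kilobytes per second divided by
# operations per second. Prints 0.0 when there were no operations.
avg_io_kb() {
    awk -v kb="$1" -v ops="$2" 'BEGIN {
        if (ops > 0) printf "%.1f\n", kb / ops
        else print "0.0"
    }'
}

avg_io_kb 93.0 1.4    # reads on vdc0:  kr/s divided by r/s
avg_io_kb 68.8 3.9    # writes on vdc0: kw/s divided by w/s
```

For that sample the reads average roughly 66 KB each and the writes roughly 18 KB each.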
Cheers
Vernon
list Buchan Milne
On Thursday, 9 September 2010 06:56:25 Vernon Everett wrote:
Hi all Just posted an iostat add-on, with graphing goodness to http://xymonton.trantor.org/doku.php/monitors:diskstat.ksh Something interesting, for perverse values of interesting, is myself and Roland Soderstrom were discussing it in the thread titled custom check / graphs with multiple rrd files.
In devmon, there is a linux-netsnmp template with a diskio check, which works for Linux hosts running net-snmp. If the RRD details are the same, it would be nice to keep things consistent ... BTW, are you just using this for trends, or are you alerting on it? Regards, Buchan