Xymon Mailing List Archive

UPS monitoring using devmon

list Thorsten Erdmann
Fri, 11 Sep 2009 10:56:00 +0200
Message-Id: <user-a6ecfdf536b1@xymon.invalid>

Hi David

I just tested your template against my MGE Galaxy. There are some minor 
issues:

1.  battery voltage is displayed as 42100V. I solved that by setting the 
transform of "upsBatteryVoltageTxt" to "{upsBatteryVoltage} / 10" instead 
of "x 10"
2.  battery current is 0A, it seems that the MGE does not deliver this 
value
3.  message test is yellow: "primary OID upsAlarmDescrTxt in table is a 
non-repeater", I don't know how to solve that
4.  output power is 0kW. If I use the special MGE MIB I get real results
5.  input power displays unrealistic results:
         Line bads: 3
         Input configuration: 3 lines of 0 V AC @ 0.0 Hz

         UPS Input
         Phase Freq Volts Amps Power 
        1 50.0 Hz  390V 28.0A 2746W 
        2 50.0 Hz  395V 28.0A 2835W 
        3 50.0 Hz  391V 28.0A 2765W 
6.  I am missing some temperature tests
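
The fix in item 1 comes down to the direction of the scaling: RFC 1628 defines upsBatteryVoltage in units of 0.1 volt, so the raw value has to be divided by 10. A minimal shell sketch of the arithmetic, assuming the raw SNMP value was 4210 (inferred from the 42100V display being the raw value times 10); the helper name is made up for illustration:

```shell
# Hypothetical helper: convert a raw reading in tenths of a unit to the unit.
scale_tenths() {
    awk -v raw="$1" 'BEGIN { printf "%.1f\n", raw / 10 }'
}

raw=4210                               # assumed raw upsBatteryVoltage value
echo "x 10: $((raw * 10))V"            # what the original transform displayed
echo "/ 10: $(scale_tenths "$raw")V"   # what the corrected transform displays
```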

So now I am unsure what to do. My options are:
- I can ignore these issues.
- I can modify your template to use the OIDs of the vendor's MIB.
- I can use my own template, which does not display as many results but 
  uses the vendor's MIB.

It seems that the vendors do not support the standard UPS MIB very 
cleanly, so I think we have to build a special template for every vendor. 
:-(

But even if I don't use your template, it was very helpful. It is a very 
good demonstration of how to use tables and other features in devmon. I 
haven't tried the graphing yet, but I will do that now. :-)

Thank you for your work!!!

Thorsten

user-cbbf693f2c89@xymon.invalid wrote on 11.09.2009 07:26:30:
My UPS template has been tweaked and is now available from the devmon SF
repository:
http://devmon.svn.sourceforge.net/viewvc/devmon/trunk/templates/ups.tar.gz?view=tar

README file includes hobbitgraph.cfg section and info on setting up RRD
collection and graphing.

Let me know if there are any problems. The msgs test attempts to show the
AlarmDesc repeater, which is empty unless there are alarms, and that may
cause issues. I've increased the number of tolerated "not found" repeaters
slightly in my copy of devmon to be more lenient. The line is in
modules/dm_snmp.pm

  if($failed_query > 2) {

I increased 2 to 6.

Your example was very helpful. I have now managed to get a working devmon
template for my MGE UPS. Only graphing isn't working yet.

BTW: does anyone know how I can format output numbers in devmon? I now
get the voltages with two digits after the decimal point, but these are
always zero, so I want to omit those digits entirely.
The number of decimal digits displayed is controlled in the transforms
file with the MATH operator by adding ": N" - e.g.

upsInputCurrentA      : MATH  : {upsInputCurrent} / 10 : 1

will display for example 10.1 - change to ": 0" to get 10
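
The effect of the trailing precision field can be tried outside devmon. This is only a shell sketch of the same divide-and-round arithmetic, not devmon code; the helper name and the sample raw value 101 are illustrative assumptions:

```shell
# Hypothetical helper: divide a raw value by 10 and print it with N decimals,
# mirroring "{upsInputCurrent} / 10 : N" from the transform line above.
math_div10() {
    awk -v raw="$1" -v prec="$2" 'BEGIN { printf("%." prec "f\n", raw / 10) }'
}

math_div10 101 1    # one decimal digit, like ": 1"
math_div10 101 0    # no decimal digits, like ": 0"
```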

David.
Thorsten Erdmann


user-cbbf693f2c89@xymon.invalid wrote on 10.09.2009 09:51:49:
Sorry all, didn't get time today to update the UPS template.

It has separate graphs for input voltage, output phase voltages, battery
levels and separate tests/alerting on various conditions - input
voltage=0, battery time remaining <15 mins, output phase overload, etc.

David.
Oops, forgot to add the graphing bit.

Usual stuff.
In hobbitserver.cfg
Add "ups=ncv" to TEST2RRD=
Add "ups" to GRAPHS=
Add line NCV_ups="Load:GAUGE,Charge:GAUGE"

Add this to hobbitgraph.cfg
[ups]
        TITLE UPS Charge
        YAXIS Power
        -u 100
        -l 0
        DEF:u=ups.rrd:Charge:AVERAGE
        DEF:p=ups.rrd:Load:AVERAGE
        LINE2:u#00CC00:Charge
        LINE2:p#0000FF:Load
        COMMENT:\n
        GPRINT:u:LAST:Charge   \: %5.1lf%s (cur)
        GPRINT:u:MAX: \: %5.1lf%s (max)
        GPRINT:u:MIN: \: %5.1lf%s (min)
        GPRINT:u:AVERAGE: \: %5.1lf%s (avg)\n
        GPRINT:p:LAST:Load   \: %5.1lf%s (cur)
        GPRINT:p:MAX: \: %5.1lf%s (max)
        GPRINT:p:MIN: \: %5.1lf%s (min)
        GPRINT:p:AVERAGE: \: %5.1lf%s (avg)\n
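
The "ups=ncv" setting hands the status text to Xymon's NCV (name-colon-value) handler, which picks numeric NAME=value or NAME: value lines out of the message; that is why the status must contain lines such as Load=... and Charge=... matching the NCV_ups definition. A rough shell approximation of that extraction, not Xymon's actual parser; the helper name and the sample text are made up:

```shell
# Hypothetical helper: print "name value" for each numeric NAME=value or
# NAME: value line found on standard input.
ncv_pairs() {
    sed -n 's/^[[:space:]]*\([A-Za-z][A-Za-z0-9_]*\)[[:space:]]*[=:][[:space:]]*\([0-9][0-9.]*\).*$/\1 \2/p'
}

# Only the last two lines have the NAME=value shape and are reported.
printf '%s\n' 'Output Load 40 %' 'Load=40' 'Charge=100' | ncv_pairs
```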
Cheers
     Vernon


*From:* Everett, Vernon [mailto:user-9da1a1882f49@xymon.invalid]
*Sent:* Thursday, 10 September 2009 2:50 PM
*To:* 'user-ae9b8668bcde@xymon.invalid'
*Subject:* RE: [hobbit] Reply: RE: [hobbit] UPS monitoring using devmon

I wrote this a while back for our MGEs.
As you can see, it predates my introduction to devmon and indicates a
complete lack of understanding of SNMP.
That being said, the code and the MIBs might give you a good
indication of where to start.
(Either that, or you can just use it as "good enough")
This was designed to run on the hobbit/xymon server.

I have been meaning to rewrite this, or move it to devmon, but just
haven't had the time.
(And it's doing an adequate job for now.)

Cheers
      V

bb-host entries look like this
1.2.3.4     galaxy3000    # http://1.2.3.4/ups_prop.htm ups galaxy3000 COMMENT:"Meaningful Comment if required"
2.3.4.5   Karratha_UPS    # http://2.3.4.5/ups_prop.htm ups galaxy3000 COMMENT:"Insert Comment"
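
The field order of these entries matters, because the script reads each matching line into IP, UPSNAME, HASH, URL, UPS, TYPE and OTHER. A small sketch of that split, using the first sample entry (the function name is hypothetical and only there for illustration):

```shell
# Hypothetical helper: split one bb-hosts line into the same fields that the
# script's "while read" loop uses, and print the ones it acts on.
parse_ups_entry() {
    echo "$1" | while read -r IP UPSNAME HASH URL UPS TYPE OTHER
    do
        echo "ip=$IP name=$UPSNAME type=$TYPE"
    done
}

parse_ups_entry '1.2.3.4     galaxy3000    # http://1.2.3.4/ups_prop.htm ups galaxy3000 COMMENT:"Meaningful Comment if required"'
```

The sixth field is what the script's case statement matches on, so the "ups galaxy3000" pair has to follow the URL directly.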


--- snip---

cat ups.ksh
#!/bin/ksh
DATE=$(date)
#set -x
SPACER=" 
"
BBTMP=/tmp
#BBHOSTS=/etc/hobbit/bb-hosts
# Prefix for the work files $OUT.tmp, $OUT.warn and $OUT.final
OUT=$BBTMP/upspage
# Make sure we read the include files too
BBHOSTLIST="$BBHOSTS $(grep ^include $BBHOSTS | awk '{ print $2 }')"
#grep -h " ups " $BBHOSTLIST | egrep -v "^page|^include"
grep -h " ups " $BBHOSTLIST | egrep -v "^page|^include" | while read IP UPSNAME HASH URL UPS TYPE OTHER
do
   echo $IP $UPSNAME
   ping -c1 $IP > /dev/null
   if [ $? -eq 0 ]
   then
      COLOUR=green
      case $TYPE in
         galaxy3000)
            > $OUT.warn
            > $OUT.tmp
            TEMP=$(snmpget -v1 -c public $IP SNMPv2-SMI::mib-2.33.1.1.1.0)
            DEVICE=${TEMP##*:}          # String
            TEMP=$(snmpget -v1 -c public $IP SNMPv2-SMI::mib-2.33.1.1.2.0)
            MODEL=${TEMP##*:}           # String
            TEMP=$(snmpget -v1 -c public $IP SNMPv2-MIB::sysLocation.0)
            LOCATION=${TEMP##*:}        # String
            TEMP=$(snmpget -v1 -c public $IP SNMPv2-SMI::mib-2.33.1.1.4.0)
            SERIAL=${TEMP##*:}          # String
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.4.8.0)
            LOWBATTERY=${TEMP##*:}      # Integer % Point at which shutdown triggered
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.1.0)
            TIME_REMAIN=${TEMP##*:}     # Integer seconds
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.2.0)
            BATTERY_LEVEL=${TEMP##*:}   # Integer %
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.9.0)
            BATTERY_FAULT=${TEMP##*:}   # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.11.0)
            BATTERY_REPLACE=${TEMP##*:} # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.15.0)
            CHARGER_FAULT=${TEMP##*:}   # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.3.0)
            OUT_ON_BAT=${TEMP##*:}      # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.4.0)
            OUT_ON_BYPASS=${TEMP##*:}   # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.10.0)
            OUT_OVERLOAD=${TEMP##*:}    # Integer 1=yes 2=no
            # Same OID as OUT_OVERLOAD above; COMMSOK is only used in the
            # commented-out check further down
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.10.0)
            COMMSOK=${TEMP##*:}         # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.1.0)
            INPHASES=${TEMP##*:}        # Integer 1 or 3
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.2.1.2)
            INVOLT=${TEMP##*:}          # Integer 10ths of a volt
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.2.1.3)
            INFREQ=${TEMP##*:}          # Integer 10ths of a Hertz
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.2.1.6)
            INAMPS=${TEMP##*:}          # Integer 10ths of an Amp
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.3.0)
            INOK=${TEMP##*:}            # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.4.0)
            INFAILCAUSE=${TEMP##*:}     # Integer 1=no fault
                                        #         2=bad voltage
                                        #         3=bad frequency
                                        #         4=no voltage
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.1.0)
            OUTPHASES=${TEMP##*:}       # Integer 1 or 3
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.2.1.2)
            OUTVOLT=${TEMP##*:}         # Integer 10ths of a volt
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.2.1.3)
            OUTFREQ=${TEMP##*:}         # Integer 10ths of a Hertz
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.2.1.4)
            OUTLOAD=${TEMP##*:}         # Integer %
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.2.1.5)
            OUTAMPS=${TEMP##*:}         # Integer 10ths of an Amp
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.11.0)
            OVERTEMP=${TEMP##*:}        # Integer 1=yes 2=no
            echo "Manufacturer                          "$DEVICE | sed 's/"//g' >> $OUT.tmp
            echo "Model                                "$MODEL | sed 's/"//g' >> $OUT.tmp
            echo "Serial Number                        "$SERIAL | sed 's/"//g' >> $OUT.tmp
            echo "Location                             "$LOCATION | sed 's/"//g' >> $OUT.tmp
            echo >> $OUT.tmp
            #$LOWBATTERY
            #WARN=""
            #[ -z "$TIME_REMAIN" ] && TIME_REMAIN=0
            #[ $TIME_REMAIN -lt 1200 -a "$COLOUR" != "red" ] && COLOUR=yellow
            #[ $TIME_REMAIN -lt 1200 ] && WARN="Time Remaining low"
            #[ $TIME_REMAIN -lt 600 ] && COLOUR=red
            #[ $TIME_REMAIN -lt 600 ] && WARN="Time Remaining critical"
            #[ "$WARN" != "" ] && echo $WARN >> $OUT.warn
            ((s=$TIME_REMAIN%60))
            ((m=$TIME_REMAIN/60))
            echo "Time Remaining                        "$m Minutes $s Seconds >> $OUT.tmp
            WARN=""
            [ -z "$BATTERY_LEVEL" ] && BATTERY_LEVEL=0
            [ $BATTERY_LEVEL -lt 95 -a "$COLOUR" != "red" ] && COLOUR=yellow
            [ $BATTERY_LEVEL -lt 95 ] && WARN="Battery level low"
            [ $BATTERY_LEVEL -lt 50 ] && COLOUR=red
            [ $BATTERY_LEVEL -lt 50 ] && WARN="Battery level critical"
            [ "$WARN" != "" ] && echo $WARN >> $OUT.warn
            echo "Battery Level                         "$BATTERY_LEVEL % >> $OUT.tmp
            BATTERY_LEVEL=$(echo $BATTERY_LEVEL | sed 's/[ \t]*//')
            if [ $BATTERY_FAULT -eq 1 ]
            then
               BF=Yes
               COLOUR=red
               echo "Battery Fault!" >> $OUT.warn
            else
               BF=No
            fi
            echo "Battery Fault                         "$BF >> $OUT.tmp
            if [ $BATTERY_REPLACE -eq 1 ]
            then
               BR=Yes
               COLOUR=red
               echo "Battery replacement required" >> $OUT.warn
            else
               BR=No
            fi
            echo "Replace Battery                       "$BR >> $OUT.tmp
            if [ $CHARGER_FAULT -eq 1 ]
            then
               CF=Yes
               COLOUR=red
               echo "Charger Fault" >> $OUT.warn
            else
               CF=No
            fi
            echo "Charger Fault                         "$CF >> $OUT.tmp
            if [ $OUT_ON_BAT -eq 1 ]
            then
               OUT_ON_BAT=Yes
               COLOUR=red
               echo "UPS running on battery" >> $OUT.warn
            else
               OUT_ON_BAT=No
            fi
            echo "On Battery                            "$OUT_ON_BAT >> $OUT.tmp
            if [ $OUT_ON_BYPASS -eq 1 ]
            then
               OUT_ON_BYPASS=Yes
               COLOUR=red
               echo "UPS on power bypass" >> $OUT.warn
            else
               OUT_ON_BYPASS=No
            fi
            echo "On Bypass                             "$OUT_ON_BYPASS >> $OUT.tmp
            if [ $OUT_OVERLOAD -eq 1 ]
            then
               OUT_OVERLOAD=Yes
               COLOUR=red
               echo "UPS output overload" >> $OUT.warn
            else
               OUT_OVERLOAD=No
            fi
            echo "Output Overload                       "$OUT_OVERLOAD >> $OUT.tmp
            if [ $OVERTEMP -eq 1 ]
            then
               OVERTEMP=Yes
               COLOUR=red
               echo "Unit overheating" >> $OUT.warn
            else
               OVERTEMP=No
            fi
            echo "Unit Overheating                      "$OVERTEMP >> $OUT.tmp
            #if [ $COMMSOK -eq 2 ]
            #then
            #   COMMSOK=No
            #   COLOUR=red
            #   echo "No comms from device" >> $OUT.warn
            #else
            #   COMMSOK=Yes
            #fi
            #echo "Comms OK                              "$COMMSOK >> $OUT.tmp
            echo >> $OUT.tmp
            echo "Input Phases                         "$INPHASES >> $OUT.tmp
            INVOLT=$(echo "scale=1 ; $INVOLT/10" | bc)
            echo "Input Voltage                         "$INVOLT >> $OUT.tmp
            INFREQ=$(echo "scale=1 ; $INFREQ/10" | bc)
            echo "Input Frequency                       "$INFREQ >> $OUT.tmp
            INAMPS=$(echo "scale=1 ; $INAMPS/10" | bc)
            echo "Input Current                         "$INAMPS >> $OUT.tmp
            if [ $INOK -eq 1 ]
            then
               # A silly case of reverse logic applies here
               INOK=No
               COLOUR=red
               echo "Power input outside tolerance" >> $OUT.warn
            else
               INOK=Yes
            fi
            echo "Input OK                              "$INOK >> $OUT.tmp
            [ $INFAILCAUSE -eq 1 ] && FAILCAUSE="No failures"
            [ $INFAILCAUSE -eq 2 ] && FAILCAUSE="Voltage out of tolerance"
            [ $INFAILCAUSE -eq 3 ] && FAILCAUSE="Frequency out of tolerance"
            [ $INFAILCAUSE -eq 4 ] && FAILCAUSE="No voltage - power fail"
            echo "Cause of Failure                      "$FAILCAUSE >> $OUT.tmp
            echo >> $OUT.tmp
            echo "Output Phases                        "$OUTPHASES >> $OUT.tmp
            OUTVOLT=$(echo "scale=1 ; $OUTVOLT/10" | bc)
            echo "Output Voltage                        "$OUTVOLT >> $OUT.tmp
            OUTFREQ=$(echo "scale=1 ; $OUTFREQ/10" | bc)
            echo "Output Frequency                      "$OUTFREQ >> $OUT.tmp
            OUTAMPS=$(echo "scale=1 ; $OUTAMPS/10" | bc)
            echo "Output Current                        "$OUTAMPS >> $OUT.tmp
            OUTLOAD=$(echo $OUTLOAD | sed 's/[ \t]*//')
            echo "Output Load                           "$OUTLOAD % >> $OUT.tmp
            echo >> $OUT.final
            cat $OUT.warn >> $OUT.final
            cat $OUT.tmp >> $OUT.final
            echo '<FONT COLOR="Black">' >> $OUT.final
            echo "Load=$OUTLOAD" >> $OUT.final
            echo "Charge=$BATTERY_LEVEL" >> $OUT.final
            echo '</FONT>' >> $OUT.final
            rm $OUT.tmp
            rm $OUT.warn
       esac
    else
       echo "Device Unreachable!" >> $OUT.final
       #COLOUR=yellow
    fi
    $BB $BBDISP "status $UPSNAME.ups $COLOUR $DATE $(cat $OUT.final)"
    rm $OUT.final
done

--- snip ---
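
The pattern the script repeats for every OID is: run snmpget, keep what follows the last colon with ${TEMP##*:}, then scale or split the number. A self-contained sketch of those steps on canned replies instead of a live device; the sample reply strings, the raw value 2305 and the helper names are assumptions for illustration, and the tenths helper is a bc-free alternative to the script's "scale=1" calls:

```shell
# Hypothetical helper: take an "OID = TYPE: value" reply and print the value
# without the leading blank that ${TEMP##*:} keeps.
snmp_value() {
    TEMP=$1
    VALUE=${TEMP##*:}      # everything after the last colon
    echo $VALUE            # unquoted on purpose, so the shell trims blanks
}

# Hypothetical helper: tenths of a unit to one decimal place, without bc.
tenths() {
    echo "$(($1 / 10)).$(($1 % 10))"
}

# Canned replies in net-snmp's usual output style, not captured from a real UPS.
TIME_REMAIN=$(snmp_value 'SNMPv2-SMI::enterprises.705.1.5.1.0 = INTEGER: 1290')
BATTERY_LEVEL=$(snmp_value 'SNMPv2-SMI::enterprises.705.1.5.2.0 = INTEGER: 95')

echo "Time Remaining $((TIME_REMAIN / 60)) Minutes $((TIME_REMAIN % 60)) Seconds"
echo "Battery Level  $BATTERY_LEVEL %"
echo "Input Voltage  $(tenths 2305) V"   # 2305 is an assumed raw reading
```

Wrapping the extraction in one function would also shrink the script considerably, since each OID then needs a single line.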


*From:* user-9219fb9415b1@xymon.invalid
[mailto:user-9219fb9415b1@xymon.invalid]
*Sent:* Thursday, 10 September 2009 2:18 PM
*To:* user-ae9b8668bcde@xymon.invalid
*Subject:* [hobbit] Reply: RE: [hobbit] UPS monitoring using devmon


Hi Craig,
hi David,

sounds very nice.

I tried that snmpwalk. It only works with SNMP v1. This is the output:

SNMPv2-SMI::mib-2.33.1.1.1.0 = STRING: "MGE UPS SYSTEMS"
SNMPv2-SMI::mib-2.33.1.1.2.0 = STRING: "Galaxy PW Single//"
SNMPv2-SMI::mib-2.33.1.1.3.0 = ""
SNMPv2-SMI::mib-2.33.1.1.4.0 = STRING: "GB (SN 49EE49044)"
SNMPv2-SMI::mib-2.33.1.1.5.0 = ""
SNMPv2-SMI::mib-2.33.1.1.6.0 = ""

As I see it, this is only the description of the UPS. I will look for
the MIB of my devices.
I don't know where all the other things that I want to monitor at a
minimum are located in the MIB, such as:

Overall status (am I running on battery?)
output power
remaining battery time
temperature
I'll look at uploading it to the SF repository.
Maybe this is a stupid question, but I can't find templates and the like
on the SF page, only devmon itself.
I looked at http://devmon.sourceforge.net/ and at
http://sourceforge.net/projects/devmon/
There is a devmon-template file there, but as far as I can see, it
contains only the templates that are delivered with devmon itself.

It would be nice if you shared your templates. You can also contact me
via user-674624a37c09@xymon.invalid at daimler dot com

Thank you very much

Thorsten

Craig

-----Original Message-----
From: David Baldwin [mailto:user-cbbf693f2c89@xymon.invalid]
Sent: Thursday, 10 September 2009 11:53 AM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] UPS monitoring using devmon

user-9219fb9415b1@xymon.invalid wrote:
Hi,

sorry if this is a little off-topic, but maybe someone has a devmon
template for the following UPS devices:

Emerson/Liebert HiPulse MM
MGE Galaxy Single
MGE Upsilon STS_100 Cross switch
MGE Upsilon STS_20 Cross switch

Thank you
Thorsten


If you are not the intended addressee, please inform us immediately
that you have received this e-mail in error, and delete it. We thank
you for your cooperation.
I have a template for the standard UPS MIB which I could send through. I
have a little more work to do on it, splitting the power status into 3
tests for input, output and battery. That should give me the motivation to
complete it today, and I will upload it to the devmon SF site.

You can check whether your UPS supports the standard UPS MIB or wants a
proprietary one (substituting your host for myups below, and your
community string if it is not "public"):

snmpwalk -v2c -cpublic myups 1.3.6.1.2.1.33.1.1

Note that some UPS devices only support SNMPv1, so try -v1 instead of
-v2c above if it doesn't work.
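
The try-v2c-then-v1 check can be wrapped in a small function. This is only a sketch: the function name and the SNMPWALK override variable are made up, and it relies on snmpwalk returning a non-zero exit status when the device does not answer, which covers timeouts but may not catch an agent that answers with an empty subtree, so the walk output is still worth reading by eye.

```shell
# Command used for the walk; overridable so the logic can be tried without a UPS.
SNMPWALK=${SNMPWALK:-snmpwalk}

# Hypothetical helper: print the first SNMP version (2c, then 1) for which a
# walk of the standard UPS MIB identity subtree succeeds.
probe_ups_mib() {
    host=$1
    community=${2:-public}
    for version in 2c 1
    do
        if "$SNMPWALK" -v"$version" -c "$community" "$host" 1.3.6.1.2.1.33.1.1 >/dev/null 2>&1
        then
            echo "$version"
            return 0
        fi
    done
    return 1
}

# Usage: probe_ups_mib myups public
```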

David.

--
David Baldwin - IT Unit
Australian Sports Commission          www.ausport.gov.au
Tel 02 62147830 Fax 02 62141830       PO Box 176 Belconnen ACT 2616
user-cbbf693f2c89@xymon.invalid          Leverrier Street Bruce ACT 2617

Keep up to date with what's happening in Australian sport visit
http://www.ausport.gov.au

This message is intended for the addressee named and may contain
confidential and privileged information. If you are not the intended
recipient please note that any form of distribution, copying or use of
this communication or the information in it is strictly prohibited and
may be unlawful. If you receive this message in error, please delete it
and notify the sender.
DISCLAIMER:


The information contained in this email message is confidential and
for the attention of the intended recipient only.
It is not necessarily the official view or communication of the
Rodney District Council.
If you are not the intended recipient you must not disclose, copy or
distribute this message or the information in it.
If you have received this message in error, please delete or destroy
all copies of the email and notify the sender immediately.
Rodney District Council accepts no responsibility for any effects
this email message or attachments has on the recipient network or
computer system.

NOTICE: This email and any attachments are confidential.
They may contain legally privileged information or
copyright material. You must not read, copy, use or
disclose them without authorisation. If you are not an
intended recipient, please contact us at once by return
email and then delete both messages and all attachments.
