Xymon Mailing List Archive

UPS monitoring using devmon

list David Baldwin
Thu, 10 Sep 2009 17:51:49 +1000
Message-Id: <user-8f1e4dd56fa8@xymon.invalid>

Sorry all, I didn't get time today to update the UPS template.

It has separate graphs for input voltage, output phase voltages, battery
levels and separate tests/alerting on various conditions - input
voltage=0, battery time remaining <15 mins, output phase overload, etc.
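
The conditions listed above could be expressed roughly as follows. This is only a sketch in shell, not the devmon template itself: the function name ups_colour, the argument order, and the mapping of each condition to yellow or red are assumptions made for illustration. Readings are expected as whole numbers.

```shell
# Hypothetical helper: map three UPS readings to a Xymon colour.
# Arguments: input voltage (V), battery minutes remaining, overload flag (1=yes).
ups_colour() {
    involt=$1 minutes=$2 overload=$3
    colour=green
    # Battery time remaining under 15 minutes: warn (severity is an assumption).
    [ "$minutes" -lt 15 ] && colour=yellow
    # No input voltage or an overloaded output phase: treat as critical.
    [ "$involt" -eq 0 ] && colour=red
    [ "$overload" -eq 1 ] && colour=red
    echo "$colour"
}

ups_colour 230 42 2   # healthy
ups_colour 230 10 2   # battery time low
ups_colour 0 10 2     # input power gone
```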

David.
Oops, forgot to add the graphing bit.
 
Usual stuff.
In hobbitserver.cfg
Add "ups=ncv" to TEST2RRD=
Add "ups" to GRAPHS=
Add line NCV_ups="Load:GAUGE,Charge:GAUGE"
 
Add this to hobbitgraph.cfg
[ups]
        TITLE UPS Charge
        YAXIS Power
        -u 100
        -l 0
        DEF:u=ups.rrd:Charge:AVERAGE
        DEF:p=ups.rrd:Load:AVERAGE
        LINE2:u#00CC00:Charge
        LINE2:p#0000FF:Load
        COMMENT:\n
        GPRINT:u:LAST:Charge   \: %5.1lf%s (cur)
        GPRINT:u:MAX: \: %5.1lf%s (max)
        GPRINT:u:MIN: \: %5.1lf%s (min)
        GPRINT:u:AVERAGE: \: %5.1lf%s (avg)\n
        GPRINT:p:LAST:Load   \: %5.1lf%s (cur)
        GPRINT:p:MAX: \: %5.1lf%s (max)
        GPRINT:p:MIN: \: %5.1lf%s (min)
        GPRINT:p:AVERAGE: \: %5.1lf%s (avg)\n
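
For the "ups=ncv" entry to produce any data, the status message itself has to contain "Name=value" (or "Name: value") lines whose names match the datasets listed in NCV_ups. The script further down does this with its "Load=" and "Charge=" lines. A minimal sketch of that step, with a made-up helper name (ncv_lines) and made-up sample values:

```shell
# Hypothetical helper: print the two lines the NCV handler will pick up.
# The names must match the dataset names in NCV_ups (Load, Charge).
ncv_lines() {
    # Strip blanks and tabs so each value is a bare number such as "42".
    load=$(echo "$1" | tr -d ' \t')
    charge=$(echo "$2" | tr -d ' \t')
    echo "Load=$load"
    echo "Charge=$charge"
}

ncv_lines " 42" " 100"
```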
Cheers
     Vernon
 
 
*From:* Everett, Vernon [mailto:user-9da1a1882f49@xymon.invalid]
*Sent:* Thursday, 10 September 2009 2:50 PM
*To:* 'user-ae9b8668bcde@xymon.invalid'
*Subject:* RE: [hobbit] Antwort: RE: [hobbit] UPS monitoring using devmon

I wrote this a while back for our MGEs.
As you can see, it predates my introduction to devmon and indicates a
complete lack of understanding of SNMP.
That being said, the code and the MIBs might give you a good
indication of where to start.
(Either that, or you can just use it as "good enough")
This was designed to run on the hobbit/xymon server.
 
I have been meaning to rewrite this, or move it to devmon, but just
haven't had the time.
(And it's doing an adequate job for now.)
 
Cheers
      V
 
bb-hosts entries look like this (each entry is a single line):
1.2.3.4     galaxy3000    # http://1.2.3.4/ups_prop.htm ups galaxy3000 COMMENT:"Meaningful Comment if required"
2.3.4.5   Karratha_UPS    # http://2.3.4.5/ups_prop.htm ups galaxy3000 COMMENT:"Insert Comment"
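
The script below picks out lines containing " ups " and splits them on whitespace with read, so the field order in the entry matters. A small sketch of that split, using the first example entry joined onto one line; the function name parse_ups_line is made up for illustration:

```shell
# Parse one bb-hosts style line the way the script's "while read" loop does.
line='1.2.3.4     galaxy3000    # http://1.2.3.4/ups_prop.htm ups galaxy3000 COMMENT:"Meaningful Comment if required"'

parse_ups_line() {
    # read splits on whitespace; everything after the sixth field lands in OTHER.
    echo "$1" | {
        read -r IP UPSNAME HASH URL UPS TYPE OTHER
        echo "ip=$IP name=$UPSNAME type=$TYPE"
    }
}

parse_ups_line "$line"
```

The sixth field (TYPE) is what the case statement in the script switches on.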

 
--- snip---
 
cat ups.ksh
#!/bin/ksh
DATE=$(date)
#set -x
SPACER="                                                                           
"
BBTMP=/tmp
#BBHOSTS=/etc/hobbit/bb-hosts
OUT=$BBTMP/upspage   # Prefix for the scratch files ($OUT.tmp, $OUT.warn, $OUT.final)
# Make sure we read the include files too
BBHOSTLIST="$BBHOSTS $(grep ^include $BBHOSTS | awk '{ print $2 }')"
#grep -h " ups " $BBHOSTLIST | egrep -v "^page|^include"
grep -h " ups " $BBHOSTLIST | egrep -v "^page|^include" | while read IP UPSNAME HASH URL UPS TYPE OTHER
do
   echo $IP $UPSNAME
   ping -c1 $IP > /dev/null
   if [ $? -eq 0 ]
   then
      COLOUR=green
      case $TYPE in
         galaxy3000)
            # Start each device with empty scratch files
            : > $OUT.warn
            : > $OUT.tmp
            TEMP=$(snmpget -v1 -c public $IP SNMPv2-SMI::mib-2.33.1.1.1.0)
            DEVICE=${TEMP##*:}       # String
            TEMP=$(snmpget -v1 -c public $IP SNMPv2-SMI::mib-2.33.1.1.2.0)
            MODEL=${TEMP##*:}        # String
            TEMP=$(snmpget -v1 -c public $IP SNMPv2-MIB::sysLocation.0)
            LOCATION=${TEMP##*:}             # String
            TEMP=$(snmpget -v1 -c public $IP SNMPv2-SMI::mib-2.33.1.1.4.0)
            SERIAL=${TEMP##*:}               # String
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.4.8.0)
            LOWBATTERY=${TEMP##*:}           # Integer % Point at which shutdown triggered
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.1.0)
            TIME_REMAIN=${TEMP##*:}     # Integer seconds
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.2.0)
            BATTERY_LEVEL=${TEMP##*:}   # Integer %
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.9.0)
            BATTERY_FAULT=${TEMP##*:}   # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.11.0)
            BATTERY_REPLACE=${TEMP##*:} # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.5.15.0)
            CHARGER_FAULT=${TEMP##*:}   # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.3.0)
            OUT_ON_BAT=${TEMP##*:}      # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.4.0)
            OUT_ON_BYPASS=${TEMP##*:}   # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.10.0)
            OUT_OVERLOAD=${TEMP##*:}    # Integer 1=yes 2=no
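            # NOTE: same OID as OUT_OVERLOAD above, probably a copy-and-paste slip.
            # COMMSOK is only used by the disabled check further down.
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.10.0)
            COMMSOK=${TEMP##*:}         # Integer 1=yes 2=no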
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.1.0)
            INPHASES=${TEMP##*:}        # Integer 1 or 3
 
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.2.1.2)
            INVOLT=${TEMP##*:}          # Integer 10ths of a volt
 
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.2.1.3)
            INFREQ=${TEMP##*:}          # Integer 10ths of a Hertz
 
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.2.1.6)
            INAMPS=${TEMP##*:}          # Integer 10ths of an Amp
 
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.3.0)
            INOK=${TEMP##*:}            # Integer 1=yes 2=no
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.6.4.0)
            INFAILCAUSE=${TEMP##*:}     # Integer 1=no fault
                                     #         2=bad voltage
                                     #         3=bad frequency
                                     #         4=no voltage
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.1.0)
            OUTPHASES=${TEMP##*:}       # Integer 1 or 3
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.2.1.2)
            OUTVOLT=${TEMP##*:}         # Integer 10ths of a volt
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.2.1.3)
            OUTFREQ=${TEMP##*:}         # Integer 10ths of a Hertz
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.2.1.4)
            OUTLOAD=${TEMP##*:}         # Integer %
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.2.1.5)
            OUTAMPS=${TEMP##*:}         # Integer 10ths of an Amp
            TEMP=$(snmpget -v1 -c public $IP 1.3.6.1.4.1.705.1.7.11.0)
            OVERTEMP=${TEMP##*:}        # Integer 1=yes 2=no
            echo "Manufacturer                          "$DEVICE | sed 's/"//g' >> $OUT.tmp
            echo "Model                                "$MODEL | sed 's/"//g' >> $OUT.tmp
            echo "Serial Number                        "$SERIAL | sed 's/"//g' >> $OUT.tmp
            echo "Location                             "$LOCATION | sed 's/"//g' >> $OUT.tmp
            echo >> $OUT.tmp
            #$LOWBATTERY
            #WARN=""
            #[ -z "$TIME_REMAIN" ] && TIME_REMAIN=0
            #[ $TIME_REMAIN -lt 1200 -a "$COLOUR" != "red" ] && COLOUR=yellow
            #[ $TIME_REMAIN -lt 1200 ] && WARN="Time Remaining low"
            #[ $TIME_REMAIN -lt 600 ] && COLOUR=red
            #[ $TIME_REMAIN -lt 600 ] && WARN="Time Remaining critical"
            #[ "$WARN" != "" ] && echo $WARN >> $OUT.warn
            ((s=$TIME_REMAIN%60))
            ((m=$TIME_REMAIN/60))
            echo "Time Remaining                        "$m Minutes $s Seconds >> $OUT.tmp
            WARN=""
            [ -z "$BATTERY_LEVEL" ] && BATTERY_LEVEL=0
            [ $BATTERY_LEVEL -lt 95 -a "$COLOUR" != "red" ] && COLOUR=yellow
            [ $BATTERY_LEVEL -lt 95 ] && WARN="Battery level low"
            [ $BATTERY_LEVEL -lt 50 ] && COLOUR=red
            [ $BATTERY_LEVEL -lt 50 ] && WARN="Battery level critical"
            [ "$WARN" != "" ] && echo $WARN >> $OUT.warn
            echo "Battery Level                        "$BATTERY_LEVEL % >> $OUT.tmp
            BATTERY_LEVEL=$(echo $BATTERY_LEVEL | sed 's/[ \t]*//')
            if [ $BATTERY_FAULT -eq 1 ]
            then
               BF=Yes
               COLOUR=red
               echo "Battery Fault!" >> $OUT.warn
            else
               BF=No
            fi
            echo "Battery Fault                         "$BF >> $OUT.tmp
            if [ $BATTERY_REPLACE -eq 1 ]
            then
               BR=Yes
               COLOUR=red
               echo "Battery replacement required" >> $OUT.warn
            else
               BR=No
            fi
            echo "Replace Battery                       "$BR >> $OUT.tmp
            if [ $CHARGER_FAULT -eq 1 ]
            then
               CF=Yes
               COLOUR=red
               echo "Charger Fault" >> $OUT.warn
            else
               CF=No
            fi
            echo "Charger Fault                         "$CF >> $OUT.tmp
            if [ $OUT_ON_BAT -eq 1 ]
            then
               OUT_ON_BAT=Yes
               COLOUR=red
               echo "UPS running on battery" >> $OUT.warn
            else
               OUT_ON_BAT=No
            fi
            echo "On Battery                            "$OUT_ON_BAT >> $OUT.tmp
            if [ $OUT_ON_BYPASS -eq 1 ]
            then
               OUT_ON_BYPASS=Yes
               COLOUR=red
               echo "UPS on power bypass" >> $OUT.warn
            else
               OUT_ON_BYPASS=No
            fi
            echo "On Bypass                             "$OUT_ON_BYPASS >> $OUT.tmp
            if [ $OUT_OVERLOAD -eq 1 ]
            then
               OUT_OVERLOAD=Yes
               COLOUR=red
               echo "UPS output overload" >> $OUT.warn
            else
               OUT_OVERLOAD=No
            fi
            echo "Output Overload                       "$OUT_OVERLOAD >> $OUT.tmp
            if [ $OVERTEMP -eq 1 ]
            then
               OVERTEMP=Yes
               COLOUR=red
               echo "Unit overheating" >> $OUT.warn
            else
               OVERTEMP=No
            fi
            echo "Unit Overheating                      "$OVERTEMP >> $OUT.tmp
            #if [ $COMMSOK -eq 2 ]
            #then
            #   COMMSOK=No
            #   COLOUR=red
            #   echo "No comms from device" >> $OUT.warn
            #else
            #   COMMSOK=Yes
            #fi
            #echo "Comms OK                              "$COMMSOK >> $OUT.tmp
            echo >> $OUT.tmp
            echo "Input Phases                         "$INPHASES >> $OUT.tmp
            INVOLT=$(echo "scale=1 ; $INVOLT/10" | bc)
            echo "Input Voltage                         "$INVOLT >> $OUT.tmp
            INFREQ=$(echo "scale=1 ; $INFREQ/10" | bc)
            echo "Input Frequency                       "$INFREQ >> $OUT.tmp
            INAMPS=$(echo "scale=1 ; $INAMPS/10" | bc)
            echo "Input Current                         "$INAMPS >> $OUT.tmp
            if [ $INOK -eq 1 ]
            then
               # A silly case of reverse logic applies here
               INOK=No
               COLOUR=red
               echo "Power input outside tolerance" >> $OUT.warn
            else
               INOK=Yes
            fi
            echo "Input OK                              "$INOK >> $OUT.tmp
            [ $INFAILCAUSE -eq 1 ] && FAILCAUSE="No failures"
            [ $INFAILCAUSE -eq 2 ] && FAILCAUSE="Voltage out of tolerance"
            [ $INFAILCAUSE -eq 3 ] && FAILCAUSE="Frequency out of tolerance"
            [ $INFAILCAUSE -eq 4 ] && FAILCAUSE="No voltage - power fail"
            echo "Cause of Failure                      "$FAILCAUSE >> $OUT.tmp
            echo >> $OUT.tmp
            echo "Output Phases                        "$OUTPHASES >> $OUT.tmp
            OUTVOLT=$(echo "scale=1 ; $OUTVOLT/10" | bc)
            echo "Output Voltage                        "$OUTVOLT >> $OUT.tmp
            OUTFREQ=$(echo "scale=1 ; $OUTFREQ/10" | bc)
            echo "Output Frequency                      "$OUTFREQ >> $OUT.tmp
            OUTAMPS=$(echo "scale=1 ; $OUTAMPS/10" | bc)
            echo "Output Current                        "$OUTAMPS >> $OUT.tmp
            OUTLOAD=$(echo $OUTLOAD | sed 's/[ \t]*//')
            echo "Output Load                           "$OUTLOAD % >> $OUT.tmp
            echo >> $OUT.final
            cat $OUT.warn >> $OUT.final
            cat $OUT.tmp >> $OUT.final
            echo '<FONT COLOR="Black">' >> $OUT.final
            echo "Load=$OUTLOAD" >> $OUT.final
            echo "Charge=$BATTERY_LEVEL" >> $OUT.final
            echo '</FONT>' >> $OUT.final
            rm $OUT.tmp
            rm $OUT.warn
       esac
    else
       echo "Device Unreachable!" >> $OUT.final
       #COLOUR=yellow
    fi
    $BB $BBDISP "status $UPSNAME.ups $COLOUR $DATE $(cat $OUT.final)"
    rm $OUT.final
done
 
--- snip ---
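
The script repeats the same three steps many times: strip the value out of a line of snmpget output, scale readings given in tenths, and turn seconds into minutes and seconds. A sketch of how those steps could be pulled into small functions follows. The function names are made up and are not part of the script; shell arithmetic is used here in place of the script's bc calls, and it only handles non-negative integers. The first sample line is invented, the second is taken from the snmpwalk output quoted later in this thread.

```shell
# Hypothetical helpers; names are illustrative and not part of the script above.

# Extract the value from one line of snmpget output, e.g.
# "... = INTEGER: 87" -> "87".
snmp_value() {
    v=${1##*: }              # drop everything up to the last ": "
    v=${v#\"}; v=${v%\"}     # drop surrounding double quotes from STRING values
    echo "$v"
}

# Convert a non-negative integer in tenths (volts, hertz, amps) to one decimal place.
tenths() {
    echo "$(( $1 / 10 )).$(( $1 % 10 ))"
}

# Format seconds as "M Minutes S Seconds", as the script does for TIME_REMAIN.
fmt_remaining() {
    echo "$(( $1 / 60 )) Minutes $(( $1 % 60 )) Seconds"
}

snmp_value 'SNMPv2-SMI::enterprises.705.1.5.2.0 = INTEGER: 87'
snmp_value 'SNMPv2-SMI::mib-2.33.1.1.1.0 = STRING: "MGE UPS SYSTEMS"'
tenths 2304
fmt_remaining 754
```

Stripping up to the last ": " rather than the last ":" also removes the leading blank that the script later cleans up with sed.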
 
 
*From:* user-9219fb9415b1@xymon.invalid [mailto:user-9219fb9415b1@xymon.invalid]
*Sent:* Thursday, 10 September 2009 2:18 PM
*To:* user-ae9b8668bcde@xymon.invalid
*Subject:* [hobbit] Antwort: RE: [hobbit] UPS monitoring using devmon


Hi Craig,
hi David,

sounds very nice.

I tried that snmpwalk. It only works with snmp V1. This is the output:

SNMPv2-SMI::mib-2.33.1.1.1.0 = STRING: "MGE UPS SYSTEMS"
SNMPv2-SMI::mib-2.33.1.1.2.0 = STRING: "Galaxy PW Single//"
SNMPv2-SMI::mib-2.33.1.1.3.0 = ""
SNMPv2-SMI::mib-2.33.1.1.4.0 = STRING: "GB (SN 49EE49044)"
SNMPv2-SMI::mib-2.33.1.1.5.0 = ""
SNMPv2-SMI::mib-2.33.1.1.6.0 = ""

As I see it, this is only the description of the UPS. I will look for
the MIB of my devices.
I don't know where in the MIB the things I want to monitor, at a
minimum, are located, for example:

Overall status, or whether I am running on battery
output power
remaining battery time
temperature
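
If a device implements the rest of the standard UPS-MIB (RFC 1628), these items sit in the battery and output groups rather than in the identification group shown in the walk output. The lookup below is a sketch under that assumption: the OIDs follow the RFC 1628 numbering, the helper name ups_oid and its keys are made up for illustration, and nothing in this thread confirms that these particular units answer for them.

```shell
# Hypothetical lookup from a short key to a standard UPS-MIB (RFC 1628) OID.
ups_oid() {
    case $1 in
        output_source)     echo 1.3.6.1.2.1.33.1.4.1.0 ;;   # upsOutputSource: 3=normal, 4=bypass, 5=battery
        output_power)      echo 1.3.6.1.2.1.33.1.4.4.1.4 ;; # upsOutputPower (watts); table column, so walk it
        minutes_remaining) echo 1.3.6.1.2.1.33.1.2.3.0 ;;   # upsEstimatedMinutesRemaining
        temperature)       echo 1.3.6.1.2.1.33.1.2.7.0 ;;   # upsBatteryTemperature (degrees C)
        *) return 1 ;;                                      # unknown key
    esac
}

for key in output_source output_power minutes_remaining temperature; do
    echo "$key $(ups_oid "$key")"
done
```

Each OID can then be handed to snmpget or snmpwalk with the same -v1 and community options as in the walk above.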
I'll look at uploading it to the SF repository.
Maybe this is a stupid question, but I can't find templates and such
on the SF page, only devmon itself.
I looked at http://devmon.sourceforge.net/ and at
http://sourceforge.net/projects/devmon/
There is a devmon-template file there, but as far as I can see it
contains only the templates that are delivered with devmon itself.

It would be nice if you could share your templates. You can also contact me
via user-674624a37c09@xymon.invalid at daimler dot com

Thank you very much

Thorsten

Craig

-----Original Message-----
From: David Baldwin [mailto:user-cbbf693f2c89@xymon.invalid]
Sent: Thursday, 10 September 2009 11:53 AM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] UPS monitoring using devmon

user-9219fb9415b1@xymon.invalid wrote:
Hi,

sorry if this is a little off-topic, but maybe someone has a devmon
template for the following UPS models:

Emerson/Liebert HiPulse MM
MGE Galaxy Single
MGE Upsilon STS_100 Cross switch
MGE Upsilon STS_20 Cross switch

Thank you
Thorsten


If you are not the intended addressee, please inform us immediately
that you have received this e-mail in error, and delete it. We thank
you for your cooperation.
I have a template for the standard UPS MIB which I could send through. I
have a little more work to do on it, splitting the power status into three
tests for input, output and battery. That should give me the motivation to
complete it today, and I will upload it to the devmon SF site.

You can check whether your UPS supports the standard UPS MIB or wants a
proprietary one (substitute your UPS hostname for myups below, and your
community string if it is not "public"):

snmpwalk -v2c -cpublic myups 1.3.6.1.2.1.33.1.1

Note that some UPS devices only support SNMPv1, so try -v1 instead of
-v2c above if it doesn't work.
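
The try-v2c-then-v1 check can be sketched as a small shell function. The name probe_ups_mib and the SNMPWALK override variable are made up for illustration; the walk itself uses the same OID and options as the command above. A reply is treated as a miss if the command fails, prints nothing, or prints a "No Such Object/Instance" message.

```shell
# Command used for the walk; overridable so the logic can be tried without a UPS.
SNMPWALK=${SNMPWALK:-snmpwalk}

# Hypothetical helper: print the SNMP version (2c or 1) under which the UPS-MIB
# identification group answers, or return non-zero if neither version works.
probe_ups_mib() {
    host=$1 community=${2:-public}
    for ver in 2c 1; do
        out=$($SNMPWALK -v"$ver" -c "$community" "$host" 1.3.6.1.2.1.33.1.1 2>/dev/null) || continue
        case $out in
            ""|*"No Such"*) continue ;;   # empty or "No Such Object/Instance" reply
        esac
        echo "$ver"
        return 0
    done
    return 1
}

# Example (needs a reachable device):
#   probe_ups_mib myups public
```

A non-zero return suggests the device wants a proprietary MIB, or is not answering SNMP at all with that community string.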

David.

--
David Baldwin - IT Unit
Australian Sports Commission          www.ausport.gov.au
Tel 02 62147830 Fax 02 62141830       PO Box 176 Belconnen ACT 2616
user-cbbf693f2c89@xymon.invalid          Leverrier Street Bruce ACT 2617


Keep up to date with what's happening in Australian sport visit
http://www.ausport.gov.au

This message is intended for the addressee named and may contain
confidential and privileged information. If you are not the intended
recipient please note that any form of distribution, copying or use of
this communication or the information in it is strictly prohibited and
may be unlawful. If you receive this message in error, please delete it
and notify the sender.


DISCLAIMER:
 

The information contained in this email message is confidential and
for the attention of the intended recipient only.
It is not necessarily the official view or communication of the
Rodney District Council.
If you are not the intended recipient you must not disclose, copy or
distribute this message or the information in it.
If you have received this message in error, please delete or destroy
all copies of the email and notify the sender immediately.
Rodney District Council  accepts no responsibility for any effects
this email message or attachments has on the recipient network or
computer system.


NOTICE: This email and any attachments are confidential. 
They may contain legally privileged information or 
copyright material. You must not read, copy, use or 
disclose them without authorisation. If you are not an 
intended recipient, please contact us at once by return 
email and then delete both messages and all attachments.
  