Conn/Ping Test/Graph to Secondary (Backup) IP
list Scott Driemeier-Showers
Good afternoon,
Having just upgraded from Xymon 4.2.3 on CentOS 5.11, our environment is now Xymon 4.3.28-1.el7.terabithia on CentOS 7.4.
We have a group of systems defined that monitors (via conn/ping) the internal interface of the primary router at each of our remote offices to tell us whether or not the connection is up. Each office also has a secondary/backup network connection that is used for load balancing, VoIP, and failover.
In order to see each site as a single row on the web page, we wrote a client-side extension to ping the backup router's internal interface. The monitoring, display, and alerting all work as we'd hoped (and have for a long time). Unlike the primary connection's monitor, however, the backup connection's test is not creating an RRD file, so it can't be graphed. I am trying to fix that and have been looking at the help page for setting up a custom graph, but I can't seem to connect the dots properly.
The key logic from /etc/xymon-client/ext/voiptest.sh is:
##
## Logic to build remote site list happens first
##
foreach my $voipsvr (@rmt_names) {
    $color = GREEN;
    $status = $bbtest . " ok";
    $DATA = "";
    ##
    ## Logic to determine secondary IP from primary IP happens here
    ##
    my ($voipip, $svrname, $junk) = split(/:/, $voipsvr, 3);
    my $pingres = `/usr/sbin/fping -Ae $voipip`;
    $DATA .= $pingres;
    if ($pingres =~ "unreachable") {
        $color = RED;
        $status = $bbtest . " NOT ok";
    }
    ## Send to Hobbit
    #############################################################################
    my $report_date = `/bin/date`;
    chomp($report_date);
    system("$ENV{XYMON} $ENV{XYMSRV} 'status $svrname.voip $color $report_date - $status\n\n$DATA'\n");
}
Could anybody point me in the right direction to get the RRD graph for each secondary TCP/Ping test to work?
Thanks,
Scott
P.S. -- The log/graph wasn't working prior to the upgrade either. We just upgraded first to make sure our environment was current...
P.P.S. -- We may have set this test up incorrectly, so please feel free to teach us if there is a better way to have done it...
list Galen Johnson
Is there anything in the xymon rrd log? Have you set up the NCV pieces in
the configs? For example, I have a test that looks at the thread count for
an internal service...I have a file under /etc/xymon/xymonserver.cfg.d
named after the test (probably not required but helps keep the associations
clear)...I have the following in the file:
GRAPHS_threads="threads"
TEST2RRD+=",threads=ncv"
SPLITNCV_threads="*:NONE,total:GAUGE,consumed:GAUGE"
I prefer splitncv to make xymon split the rrd data into multiple individual
files. The output in the test that drives the graph detail is:
# Tell Xymon about it
$XYMON $XYMSRV "status $MACHINE.$COLUMN $COLOR $TIMESTAMP
${MSG}
<!--
pctutil: ${THREAD_UTIL}
total: ${TOTAL_THREADS}
consumed: ${THREADS_CONSUMED}
-->
Summary
${COMPONENT} - $THREAD_UTIL% threads used
Details
${STATS_nocolon}
"
The content in the 'var: value' lines is what contains the data to be
tracked in RRD. It's not clear from the snippet you provided whether your
message contains this or whether it's being sent across a different channel
(I'm guessing not). Basically, Xymon may not even know it has data to collect
since it's not being put in a format it expects. IIRC, it should parse
either "var: value" or "var=value".
=G=
Note: there is also a graph definition in /etc/xymon/graphs.d
PPS: I'm using the terabithia rpms as well
PPPS: getting the user defined graphs working the way you want can be
tricky...I almost always have to go rooting through the archives.
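Adapted to the voip column discussed in this thread, the same three settings might look like the fragment below. This is only a sketch modelled on the threads example: the file name and the dataset name rtt are hypothetical, and rtt has to match whatever "name: value" line the test actually sends.

```shell
# Hypothetical /etc/xymon/xymonserver.cfg.d/voip.cfg (assumed file name)
GRAPHS_voip="voip"
TEST2RRD+=",voip=ncv"
# "rtt" is an assumed dataset name; everything else in the message is ignored
SPLITNCV_voip="*:NONE,rtt:GAUGE"
```

If the column is already mapped to another handler in TEST2RRD, that entry would presumably need to be removed so the two do not conflict.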
list Scott Driemeier-Showers
I have been attempting to use the “standard” tcp (conn) tests and pre-defined graphs, rather than building something “custom”.
In /etc/xymon/xymonserver.cfg I added:
TEST2RRD=" … ,voip=tcp"
GRAPHS=" … ,voip"
I did not configure it to use NCV as I understood, based on my reading online, that was only necessary with output formatted as colon-separated “columnar” data. Since this test runs a simple fping the output looks like this:
10.10.16.251 is alive (25.8 ms)
I rewrote the script yesterday as a shell script (no perl). Here’s the entire thing:
#!/bin/sh
COLUMN=voip
# Take the address and hostname (first two fields) of each matching hosts.cfg entry
for RMT in `cat /etc/xymon/hosts.cfg | egrep "vya|tal" | tr -s ' ' | cut -d" " -f1,2 --output-delimiter=,`
do
    ADDR=`echo $RMT | awk -F, '{print $1}'`
    SITE=`echo $RMT | awk -F, '{print $2}'`
    RSLT=ok
    STAT=green
    # Derive the backup router's address from the primary address
    VOIP=${ADDR/192.168/10.10}
    VOIP=${VOIP/.5./.6.}
    # Status messages use commas in place of dots in the hostname
    SRVR=${SITE//./,}
    DATA=`/usr/sbin/fping -Ae $VOIP`
    if [[ $DATA == *"unreachable"* ]]
    then
        RSLT="NOT ok"
        STAT=red
    fi
    MESG="Service $COLUMN on $SITE is $RSLT"
    # Results to Xymon
    #echo $ADDR $SRVR $DATA $STAT $MESG
    $XYMON $XYMSRV "status $SRVR.$COLUMN $STAT `date +"%a %b %d %H:%M:%S %Y"` $COLUMN $RSLT
${MESG}
${DATA}
"
done
exit 0
Unfortunately no, the RRD log files are not being created either – so it looks like I’ve missed some critical steps there too.
Thanks,
Scott
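Since the fping output quoted above is free text rather than "name: value" data, one possible first step is to pull the number out of it. The following is a minimal sketch, assuming the output always has the "<addr> is alive (<n> ms)" shape shown in this message; the function name fping_rtt_ms is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the round-trip time in milliseconds found in one
# line of fping output, or print nothing when the line has no "(<n> ms)" part.
fping_rtt_ms() {
    printf '%s\n' "$1" | sed -n 's/.*(\([0-9][0-9.]*\) ms).*/\1/p'
}

fping_rtt_ms "10.10.16.251 is alive (25.8 ms)"   # prints 25.8
fping_rtt_ms "10.10.16.251 is unreachable"       # prints nothing
```

An empty result for an unreachable host leaves it to the caller to decide whether to send any value at all for that run.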
list Galen Johnson
I expect you're going to have to use NCV to generate the rrds. I don't think Xymon will look at a number and just decide to track it. I expect someone will correct me if this is not the case. I'd focus on getting Xymon to create and populate the rrd files, then worry about the graphing.
=G=
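As a rough illustration of the NCV suggestion, the body of the voip status message could carry the measured value in a "name: value" line inside an HTML comment, the same layout as the threads example earlier in the thread. This is a sketch only: the function name voip_status_body and the dataset name rtt are hypothetical, and the server side would still need a matching NCV definition for the column.

```shell
#!/bin/sh
# Hypothetical helper: build the body of a status message.
#   $1 = summary line, $2 = raw fping output, $3 = round-trip time in ms (may be empty)
voip_status_body() {
    printf '%s\n' "$1"
    # Only emit the NCV block when a value was actually measured.
    if [ -n "$3" ]; then
        printf '<!--\nrtt: %s\n-->\n' "$3"
    fi
    printf '%s\n' "$2"
}

voip_status_body "Service voip on site1 is ok" "10.10.16.251 is alive (25.8 ms)" "25.8"
```

In the shell script from this thread, the output of such a function would take the place of the ${MESG} and ${DATA} lines in the message passed to $XYMON.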