Hi
▸ quoted from Usa Ims
On 23 January 2015 at 06:43, usa ims via Xymon <xymon at xymon.com> wrote:
I have a custom client test that I only want it to be executed once every
day at 2:15am. After looking through the man pages of hosts.cfg and
alerts.cfg, I am confused on what is the best course of action. From
reading the archives, it was indicated to use the cron utility of Linux for
it to start at ‘2:15am’.
The tasks.cfg allows specifying a "CRONDATE" so you don't need to use Linux
cron, and it's probably better to keep it all within Xymon.
▸ quoted from Usa Ims
I would like to be alerted only once but the test to run every hour until
recovered.
So normally you want the test to run once only at 2:15am. But if it's in a
failed state, you want the test re-run every hour.
This is also what happens with the network tests, where the
xymonnet-again.sh script looks for failed network tests and re-runs them
every minute rather than the normal 5-minute interval. You can't make use
of the xymonnet-again.sh script because it only works with the standard
network tests (it simply runs xymonnet specifying the tests to repeat).
But you could use the same idea.
▸ quoted from Usa Ims
In the clientlaunch.cfg of the xymon/hobbit client, I’m going to put 15m,
am I correct?
Why 15 minutes? Did you mean 60 minutes?
[xxxxx]
ENVFILE $HOBBITCLIENTHOME/etc/hobbitclient.cfg
CMD $HOBBITCLIENTHOME/ext/xxxxx.pl
LOGFILE $HOBBITCLIENTHOME/logs/xxxxx.log
INTERVAL 60m
Here you could put (instead of INTERVAL) something like:
CRONDATE 15 2 * * *
to run at 2:15 every morning.
▸ quoted from Usa Ims
And in the script I’m going to put ‘status+24h’ so that it will not turn
purple.
Yep.
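As a rough illustration of the 'status+24h' idea, here is a minimal shell
sketch of how the message could be put together. build_status_message is
a made-up helper name, the message text is a placeholder, and I'm assuming
the hobbit client environment provides $BB, $BBDISP and $MACHINE (newer
Xymon clients call the first two $XYMON and $XYMSRV).

```shell
#!/bin/sh
# Sketch only: build a status message with a 24-hour lifetime.
# build_status_message is a hypothetical helper, not part of Xymon.
build_status_message() {
    host="$1"        # hostname as the client reports it (e.g. $MACHINE)
    test_name="$2"   # name of the test column, e.g. xxxxx
    color="$3"       # green, yellow or red
    text="$4"        # free-form text shown on the status page
    # "status+24h" asks the server to keep this status valid for 24 hours,
    # so a test that only runs once a day does not go purple in between.
    printf 'status+24h %s.%s %s %s\n' "$host" "$test_name" "$color" "$text"
}

# Only try to send when the (assumed) client variables are present.
if [ -n "${BB:-}" ] && [ -n "${BBDISP:-}" ]; then
    "$BB" "$BBDISP" "$(build_status_message "$MACHINE" xxxxx green "$(date) check OK")"
fi
```

Outside the client environment the script just defines the function and
sends nothing, which makes it easy to try by hand first.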
▸ quoted from Usa Ims
In the alerts.cfg, I have the following:
HOST=xxxxx
SCRIPT /etc/xymon/xxxxx-emails-geoxx.sh SERVICE=xxxxx COLOR=yellow REPEAT=24h FORMAT=SCRIPT RECOVERED
Yep, so this will run the script (I'm guessing) to send an email when the
service fails and when it is restored.
What you could do is add another line to alerts.cfg to re-run your original
monitoring script for the test, but only when the test is failing.
Something like this (notice that I put the SERVICE on the first line so
that it applies to both SCRIPT lines):
HOST=xxxxx SERVICE=xxxxx
SCRIPT /etc/xymon/xxxxx-emails-geoxx.sh SERVICE=xxxxx COLOR=yellow REPEAT=24h FORMAT=SCRIPT RECOVERED
SCRIPT $HOBBITCLIENTHOME/ext/xxxxx.pl "&host&" FORMAT=SCRIPT DURATION>5m DURATION<16h REPEAT=1h
So the second line will mean that an error condition will cause a re-test,
and when the error condition stops, the re-test will also stop. Some
important things to note here are:
a) If the script that does the checks (xxxxx.pl) operates on several hosts,
then it should accept a host name as a parameter and limit its checks to
that host. Otherwise when one host fails, all of them would be re-tested.
b) The "DURATION>5m" condition means that the re-test won't happen
immediately on failure, but only once the failure has lasted more than 5
minutes; after that it repeats at the REPEAT interval of 1h. There's no
point re-testing within 3 seconds of the first test failing.
c) I've set the maximum DURATION to 16 hours because if you haven't fixed
the problem by 6pm it probably won't get done until the next day. Adjust
this as you see fit, but it's probably not worth going beyond 24h, since
the scheduled 2:15am run comes around again by then.
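To illustrate point (a), here is a minimal shell sketch of a check script
that limits itself to one host when a host name is passed in. Your real
script is Perl, so treat this only as an outline of the idea: ALL_HOSTS,
hosts_to_check and run_checks are made-up names, and I'm assuming the host
name arrives as the first argument.

```shell
#!/bin/sh
# Sketch only: check one host when a name is given, otherwise all hosts.
# ALL_HOSTS, hosts_to_check and run_checks are hypothetical names.
ALL_HOSTS="hosta hostb hostc"

hosts_to_check() {
    if [ -n "${1:-}" ]; then
        # Re-test of a failing host: check only the named host.
        printf '%s\n' "$1"
    else
        # Normal scheduled run: check every host, one name per line.
        printf '%s\n' $ALL_HOSTS
    fi
}

run_checks() {
    for h in $(hosts_to_check "${1:-}"); do
        echo "checking $h"   # placeholder for the real check
    done
}

run_checks "${1:-}"
```

Run without arguments it walks through every host; run with a host name
it touches only that one, so a failure on one host does not cause the
others to be re-tested.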
J