Xymon Mailing List Archive

Faulty downtime

6 messages in this thread

Martha McConaghy · Thu, 18 Jun 09 10:13:14 EDT
We're running the 4.3.0.0-beta2 level of Xymon, but saw this problem with
the previous beta level too.  The downtime option in bb-hosts does not seem
to work for servers that are running a Xymon client.  No matter how you set
it (i.e. what days or times), the tests never go blue.  It does show up as
planned downtime when you display the "Info" about the server, but it never
actually happens.  (Consequently, alerts are generated during the outage
time.)

The downtime option works fine for servers that are not running a client,
though.

Has anyone else seen this?  I suspect this is a bug.  Any ideas how to
work around it?

Martha McConaghy
Marist College IT
Dominique Frise · Thu, 18 Jun 2009 17:25:47 +0200
From our experience, there are two issues with DOWNTIME:

1) The documentation in bb-hosts(5) should read:

DOWNTIME=[columns:]day:starttime:endtime:cause[,[columns:]day:starttime:endtime:cause]


2) The info column does not reflect a proper DOWNTIME configuration.

Example:
A valid and working configuration like "DOWNTIME=ldap:1:0729:0734:Restart" will be shown as "Planned downtime: ldap:1:0729:0734:Restart" on the info page (the 1 is not decoded as Monday).

See also http://www.hswn.dk/hobbiton/2009/04/msg00301.html


You can check your DOWNTIME settings with hobbitd_alert(8) --test HOST SERVICE [options]
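
As an illustration of the syntax in point 1, here is a minimal Python sketch that splits a DOWNTIME value into its fields and checks whether an entry covers a given moment. It is not Xymon's own parser. The names DowntimeEntry, parse_downtime and is_active are hypothetical, and the sketch assumes that day digits run from 0=Sunday to 6=Saturday (consistent with 1 meaning Monday above), that "W" means Monday to Friday, that a four-field entry applies to all columns, that the end time is inclusive, and that the cause contains no comma or colon.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DowntimeEntry:
    columns: str  # status column name, or "*" for all columns
    days: str     # "*", "W", or digits 0-6 (assumed 0=Sunday, 1=Monday)
    start: str    # "HHMM"
    end: str      # "HHMM"
    cause: str


def parse_downtime(value):
    """Parse the text after 'DOWNTIME=' into a list of DowntimeEntry."""
    entries = []
    for part in value.split(","):
        fields = part.split(":")
        if len(fields) == 4:
            # Assumption: no column field means "all columns".
            # Xymon itself may not accept this short form.
            fields.insert(0, "*")
        if len(fields) != 5:
            raise ValueError(f"unexpected DOWNTIME entry: {part!r}")
        columns, days, start, end, cause = fields
        entries.append(
            DowntimeEntry(columns, days, start, end, cause.strip('"'))
        )
    return entries


def is_active(entry, column, when):
    """Return True if the entry covers this status column at this time."""
    if entry.columns != "*" and entry.columns != column:
        return False
    weekday = when.isoweekday() % 7  # Sunday=0 ... Saturday=6
    if entry.days == "*":
        day_ok = True
    elif entry.days == "W":
        day_ok = 1 <= weekday <= 5
    else:
        day_ok = str(weekday) in entry.days
    hhmm = when.strftime("%H%M")
    # Zero-padded HHMM strings compare correctly as text.
    # Windows that wrap past midnight are not handled here.
    return day_ok and entry.start <= hhmm <= entry.end


entry = parse_downtime("ldap:1:0729:0734:Restart")[0]
print(is_active(entry, "ldap", datetime(2009, 6, 15, 7, 30)))  # a Monday
```

Keeping the column and day as separate named fields makes it harder to confuse the two when reading an entry.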


Dominique
Martha McConaghy · Thu, 18 Jun 09 17:55:15 EDT
Thanks for the advice.  Those issues don't seem to apply here.  The
definition is pretty straightforward:

10.10.1.155 formfusion # conn nopropred:msgs http://formfusion.ad.marist.edu
       downtime=*:0050:0115:"Regular service restart"

The history should show it going blue from 00:50 to 01:15 each night, but
it doesn't.  Alerts periodically get generated too.

I did check it with hobbitd_alert as you suggested.  Without the time option,
it shows a match.  When I use the time option, i.e. --time 1245373200
there is no match.  That suggests to me that we should never get an alert
during that time.  Yet we do, 2 or 3 times a week.  Nearly always at 1:05.
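
When testing with --time, it may be worth confirming which wall-clock time the epoch value represents on the Xymon server. The Python sketch below assumes the server runs on US Eastern Daylight Time (these messages are stamped EDT, but the server's timezone is not stated in the thread) and that the value is evaluated against the server's local time. The helper names describe_epoch and epoch_for_local are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the Xymon server is on EDT (UTC-4).
EDT = timezone(timedelta(hours=-4), "EDT")


def describe_epoch(epoch, tz=EDT):
    """Format Unix epoch seconds as a wall-clock time in the given zone."""
    return datetime.fromtimestamp(epoch, tz).strftime("%Y-%m-%d %H:%M %Z")


def epoch_for_local(year, month, day, hour, minute, tz=EDT):
    """Return epoch seconds for a wall-clock time in the given zone."""
    local = datetime(year, month, day, hour, minute, tzinfo=tz)
    return int(local.timestamp())


print(describe_epoch(1245373200, timezone.utc))  # 2009-06-19 01:00 UTC
print(describe_epoch(1245373200))                # 2009-06-18 21:00 EDT
print(epoch_for_local(2009, 6, 19, 1, 5))        # 1245387900
```

Under these assumptions, 1245373200 falls at 21:00 local time, outside the 00:50 to 01:15 window, so a test at 01:05 local time would use 1245387900 instead.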


Martha


Andreas Kunberger · Fri, 19 Jun 2009 08:08:34 +0200
Martha McConaghy wrote:
> Thanks for the advice.  Those issues don't seem to apply here.  The
> definition is pretty straightforward:
>
> 10.10.1.155 formfusion # conn nopropred:msgs http://formfusion.ad.marist.edu
>        downtime=*:0050:0115:"Regular service restart"

My downtime tags look like
    DOWNTIME=*:*:0030:0300:backup
and they are working with Xymon 4.2.3

Best regards,
Andreas
Dominique Frise · Fri, 19 Jun 2009 08:31:01 +0200
Martha McConaghy wrote:
> Thanks for the advice.  Those issues don't seem to apply here.  The
> definition is pretty straightforward:
>
> 10.10.1.155 formfusion # conn nopropred:msgs http://formfusion.ad.marist.edu
>        downtime=*:0050:0115:"Regular service restart"
>
> The history should show it going blue from 00:50 to 01:15 each night, but
> it doesn't.  Alerts periodically get generated too.

DOWNTIME will convert yellow/red alerts to blue during the specified
time. If there is no alert, the status will stay green over the DOWNTIME
period.
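
The rule just described can be sketched in a few lines of Python. This only restates the behaviour as described in this message; it is not taken from the Xymon source, and the function name displayed_color is hypothetical.

```python
def displayed_color(raw_color, in_downtime):
    """Return the color shown for a status, per the rule described above."""
    # Only yellow/red statuses are turned blue during a downtime window;
    # any other status (for example green) is shown unchanged.
    if in_downtime and raw_color in ("yellow", "red"):
        return "blue"
    return raw_color


print(displayed_color("red", True))    # blue
print(displayed_color("green", True))  # green
```

Under this rule, a service that stays green through the window leaves no blue period in its history.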

> I did check it with hobbitd_alert as you suggested.  Without the time option,
> it shows a match.  When I use the time option, i.e. --time 1245373200
> there is no match.  That suggests to me that we should never get an alert
> during that time.  Yet we do, 2 or 3 times a week.  Nearly always at 1:05.

Give this a try (DOWNTIME must be in capital letters):

DOWNTIME=*:0050:0115:"Regular service restart"


Dominique
Johan Sjöberg · Fri, 19 Jun 2009 08:57:03 +0200
I think the info page parses the downtime tags the wrong way. My tags are formatted as column:day:start:end:comment, and that works, but the info page parses them as day:column:start:end:comment.
The conclusion I came to is that you need both the column and day fields to get it working.

/Johan