Xymon Mailing List Archive

hobbit-clients.cfg DISK question

16 messages in this thread

list Dirk Kastens · Mon, 17 Oct 2005 17:58:25 +0200 ·
Hi,

I have one host with some filesystems that are always 100% full. So I wanted to disable the red alarm by setting the warnlevel and paniclevel in the hobbit-clients.cfg to 101. The filesystems are /tsm/data, /tsm/db1, and /tsm/db2.
I tried the following definitions without success:

HOST=glaukos.x.y.z
a) DISK /tsm/data 101 101

b) DISK /tsm/* 101 101

c) DISK %/tsm/* 101 101

d) DISK %^/tsm/* 101 101

e) DISK "%^/tsm/*" 101 101

The hobbit server always shows a red alarm. I use hobbit 4.1.2. What did I do wrong?

Regards,
Dirk
list Henrik Størner · Mon, 17 Oct 2005 18:02:46 +0200 ·
On Mon, Oct 17, 2005 at 05:58:25PM +0200, Dirk Kastens wrote:
I have one host with some filesystems that are always 100% full. So I wanted to disable the red alarm by setting the warnlevel and paniclevel in the hobbit-clients.cfg to 101. The filesystems are /tsm/data, /tsm/db1, and /tsm/db2.
I tried the following definitions without success:

HOST=glaukos.x.y.z
a) DISK /tsm/data 101 101

b) DISK /tsm/* 101 101

c) DISK %/tsm/* 101 101

d) DISK %^/tsm/* 101 101

e) DISK "%^/tsm/*" 101 101
You hit one of the intricacies of regular expressions. "*" means
"zero or more of the pattern immediately to the left", so
"/*" matches "/", "//", "///" etc.

What you want is
   DISK %^/tsm/.* 101 101

because ".*" matches any character, zero or more times.
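
To see the difference between the two quantifiers, here is a minimal sketch using Python's re module, which treats "*" and ".*" the same way as the PCRE-style expressions Hobbit accepts. The helper names are made up for the illustration. Note that "^/tsm/*" only fails if the whole mount point has to match; this thread does not say whether Hobbit requires a whole-string match or only a match at the beginning, so both cases are shown.

```python
import re

# Mount points from the original question.
mounts = ["/tsm/data", "/tsm/db1", "/tsm/db2"]

without_dot = re.compile(r"^/tsm/*")  # "*" repeats only the preceding "/"
with_dot = re.compile(r"^/tsm/.*")    # ".*" repeats "any character"


def whole_string_matches(pattern, text):
    """True only if the pattern accounts for the entire string."""
    return pattern.fullmatch(text) is not None


def prefix_matches(pattern, text):
    """True if the pattern matches starting at the beginning of the string."""
    return pattern.match(text) is not None


for mount in mounts:
    print(
        mount,
        whole_string_matches(without_dot, mount),
        whole_string_matches(with_dot, mount),
        prefix_matches(without_dot, mount),
    )
```

With whole-string matching only the ".*" form covers "/tsm/data"; with prefix matching both forms do, in which case rule ordering is the more likely culprit.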


Regards,
Henrik
list Larry Barber · Mon, 17 Oct 2005 12:07:09 -0400 (EDT) ·
Did you put these after the DEFAULT entries? They should go in ahead of
the DEFAULT settings since Hobbit will take the first action that
matches. I've also noticed that the new setting sometimes won't be
picked up without a restart.
Thanks,
Larry Barber
list Adam Scheblein · Mon, 17 Oct 2005 11:58:27 -0500 ·
Your rules should look like this (these are mine, and they work):

DISK %^/cdrom.*/ 101 102
DISK /mnt 101 102
DISK /specific/mount/point 98 99 HOST=hostname

Also, keep in mind that rules are processed from the top down, so if
your rule is at the bottom it will never be reached, because the
default DISK rule matches first.
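
The ordering point can be sketched in Python as a toy first-match lookup. This is not Hobbit's code: the function name, the catch-all pattern standing in for the default DISK rule, and the 90/95 levels are all assumptions made for illustration, and plain (non-regex) mount names are left out.

```python
import re


def first_matching_rule(rules, mount):
    """Return the (warn, panic) levels of the first rule whose pattern matches."""
    for pattern, warn, panic in rules:
        if re.match(pattern, mount):
            return warn, panic
    return None


# Hypothetical catch-all standing in for a default DISK rule; 90/95 are made-up levels.
catch_all = (r".*", 90, 95)
tsm_rule = (r"^/tsm/.*", 101, 101)

# Catch-all listed first: the specific rule is never reached.
print(first_matching_rule([catch_all, tsm_rule], "/tsm/data"))

# Specific rule listed first: it takes effect.
print(first_matching_rule([tsm_rule, catch_all], "/tsm/data"))
```

In this model, moving the specific rule above the catch-all is the only change needed for the 101/101 levels to apply.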

Adam
list Wes Neal · Mon, 17 Oct 2005 13:05:03 -0400 ·
 I have this for an alert:

HOST=$opt1frame
        MAIL $ems1 REPEAT=20m RECOVERED
        MAIL page_ems_team1 at somewhere SERVICE=conn DURATION>10 DURATION<15

The host went down, and is still down.  I am getting the email every 20
minutes from the first line, but the second line never kicked off to my
pager.  Any ideas why it would not?

Thanks
Wes
list Larry Barber · Mon, 17 Oct 2005 13:40:26 -0400 (EDT) ·
You might have better luck with:

MAIL page_ems_team1 at somewhere SERVICE=conn DURATION>10 REPEAT=365d

-- or some other big number for the REPEAT, assuming you only want one
page (default is 1 day, I think). You might also check the notification
log to see if the rule ever fired. 
Thanks,
Larry Barber


list Henrik Størner · Mon, 17 Oct 2005 19:52:36 +0200 ·
On Mon, Oct 17, 2005 at 01:05:03PM -0400, Wes Neal wrote:
 I have this for an alert:

HOST=$opt1frame
        MAIL $ems1 REPEAT=20m RECOVERED
        MAIL page_ems_team1 at somewhere SERVICE=conn DURATION>10 DURATION<15

The host went down, and is still down.  I am getting the email every 20
minutes from the first line, but the second line never kicked off to my
pager.  Any ideas why it would not?
Those two duration entries probably cancelled out all of the alerts.


Henrik
list Wes Neal · Mon, 17 Oct 2005 14:02:52 -0400 ·
Why would it cancel out all the alerts?

Wes 
list Wes Neal · Mon, 17 Oct 2005 14:03:29 -0400 ·
Thanks Larry, I will try it that way. I do basically just want one page per
event that lasts more than 10 minutes.  I didn't think about just setting a
really high repeat value.

Wes 
list Dirk Kastens · Tue, 18 Oct 2005 08:53:20 +0200 ·
Hi,

Henrik Stoerner wrote:
because ".*" matches any character, zero or more times.
Yes, there were two mistakes that I made: the first was the missing dot 
and the second the DEFAULT entries at the top of the file.

Thanks to everyone for their help.

Dirk
list David Welker · Wed, 11 Nov 2015 13:46:40 -0500 ·
I admit, this is my first attempt at alerting, so this may or may not be
user error.  It also could be an issue that is fixed in the latest upgrade
(I'm still on 4.3.21), but before I resort to that (which will be soon
enough), I wanted to make sure I was at least thinking along the right
line...

The cpu column is reporting green.
The entry in alerts.cfg includes a COLOR=yellow parameter.
Running the xymond_alert --test reports (as expected)...
Failed 'MAIL me at here COLOR=yellow EXSERVICE=conn,disk' (color)

If, however, I change the COLOR to red, or even if I just add it
(COLOR=yellow,red), I get the "Mail alert with command 'mail -s...'" line
rather than the failed.  Did I do something wrong?

Funny though, either way I don't actually get the email.

Thanks,
David
list Martin Lenko · Thu, 12 Nov 2015 03:10:34 +0000 ·
Hi David,
to specify multiple color values you will have to use a regular expression.
Use something like this in your alert config:
COLOR=%(yellow|red)

The '%' character tells xymon that the following string is a regular
expression.

Regards,
Martin
list David Welker · Thu, 12 Nov 2015 09:44:48 -0500 ·
Martin,

Thanks!  Makes sense.  I guess the alerts.cfg.5.html file needs to be
changed?  That's where I was looking...
*COLOR=color[,color]* Rule matching an alert by color. Can be "red",
"yellow", or "purple". The forms "!red", "!yellow" and "!purple" can also
be used to NOT send an alert if the color is the specified one.

Well, that fixed my first issue.  Now when the cpu is reporting green, the
--test gives me a "Failed" message as expected, but with the cpu reporting
yellow, I get the same "Failed" message - failed due to color!

Even took xymoncmd out of the equation and the result is the same.
Took out EXSERVICE as well.

Now all I have in alerts.cfg:
PAGE=mypage
  MAIL user-4b28831ca997@xymon.invalid COLOR=%(red|yellow)

..yet when I run:
 xymond_alert --test server.com cpu
I get..
#### DATE TIME Failed 'MAIL user-4b28831ca997@xymon.invalid COLOR=%(red|yellow)' (color)

Any other suggestions?  For example, do I have to restart the server after
making changes to the alerts.cfg file?

Thanks,
David
list Japheth Cleaver · Thu, 12 Nov 2015 07:41:56 -0800 ·
On Thu, November 12, 2015 6:44 am, David Welker wrote:
Thanks!  Makes sense.  I guess the alerts.cfg.5.html file needs to be
changed?  That's where I was looking...
*COLOR=color[,color]* Rule matching an alert by color. Can be "red",
"yellow", or "purple". The forms "!red", "!yellow" and "!purple" can also
be used to NOT send an alert if the color is the specified one.

Well, that fixed my first issue.  Now when the cpu is reporting green, the
--test gives me a "Failed" message as expected, but with the cpu reporting
yellow, I get the same "Failed" message - failed due to color!

Even took xymoncmd out of the equation and the result is the same.
Took out EXSERVICE as well.

Now all I have in alerts.cfg:
PAGE=mypage
  MAIL user-4b28831ca997@xymon.invalid COLOR=%(red|yellow)

..yet when I run:
 xymond_alert --test server.com cpu
I get..
#### DATE TIME Failed 'MAIL user-4b28831ca997@xymon.invalid COLOR=%(red|yellow)' (color)

Any other suggestions?  For example, do I have to restart the server after
making changes to the alerts.cfg file?

I believe the proper syntax there is simply:

PAGE=mypage COLOR=red,yellow
   MAIL user-4b28831ca997@xymon.invalid

COLOR can be a list of options, unlike most of the other filters (such as
EXSERVICE) which must either lexically match OR have to be prepended with
a '%' to enter "pcre-regex" mode. Additionally, COLOR= appears to not be
allowed as a per-recipient filter; in this case, it'll need to be on the
main line.
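
As a rough illustration of the list form, here is a small Python sketch of a comma-separated color filter, including the "!color" exclusions mentioned in the man page excerpt quoted above. It is a toy model, not Xymon's parser: the function name is made up, and letting every non-excluded color pass when only exclusions are listed is an assumption.

```python
def color_filter_matches(spec, color):
    """Evaluate a comma-separated color list such as "red,yellow" or "!purple"."""
    items = [item.strip() for item in spec.split(",") if item.strip()]
    wanted = {item for item in items if not item.startswith("!")}
    excluded = {item[1:] for item in items if item.startswith("!")}
    if color in excluded:
        return False
    # Assumption: with only exclusions listed, any other color passes.
    return color in wanted if wanted else True


print(color_filter_matches("red,yellow", "yellow"))
print(color_filter_matches("yellow,red", "yellow"))
print(color_filter_matches("%(red|yellow)", "yellow"))
```

In this model the order of the listed colors makes no difference, and a regex-style value such as "%(red|yellow)" is just one literal item that no real color equals, which would be consistent with the "(color)" failure reported earlier.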

One easy way to check the syntax and rule logic is to run xymoncmd
xymond_alert --dump-config . What shows up in the result will be how the
alert system interprets it.

Also, to actually enable 'yellow' alerts, you'll want to ensure the color
is added to "$ALERTCOLORS" in xymonserver.cfg if it's not already.


HTH,
-jc
list David Welker · Thu, 12 Nov 2015 12:37:13 -0500 ·
jc,

Using your idea...
PAGE=mypage COLOR=red,yellow
   MAIL user-4b28831ca997@xymon.invalid

...when cpu is green, yellow, purple, or red --test now always returns a
mail -s with CRITICAL (RED) message.

which may actually BE correct, since elsewhere on the page there is a red
status and I've now put the COLOR on the main line referencing the whole
page. In an attempt to narrow it back down to the one host, however, I used
HOST=^Server1*, instead of PAGE=, but STILL get the CRITICAL (RED) message
in the --test run, but actually got an email regarding the yellow cpu
status (and not for a different yellow status on the same page).

--dump-config returns...
HOST=^Server1* COLOR=yellow,red     <---- NOTICE REVERSED COLORS (see
above)??
MAIL user-4b28831ca997@xymon.invalid FORMAT=TEXT REPEAT=30

Interestingly enough, when I use a --color parameter with the --test, it
performs correctly...
If --color of cpu is yellow, --test returns a warning (YELLOW) message.
If --color of cpu is red, --test returns a CRITICAL (RED) message.
If --color of cpu is green or purple, --test returns a FAILED (due to
color) message.

By the way, my ALERTCOLORS="red,yellow,purple"

Could it possibly be that the --test feature needs more testing, and that
the alerting is actually working the way I expect it should?

Thanks,
David


list Japheth Cleaver · Thu, 12 Nov 2015 11:56:02 -0800 ·
On 11/12/2015 9:37 AM, David Welker wrote:
jc,

Using your idea...
PAGE=mypage COLOR=red,yellow
   MAIL user-4b28831ca997@xymon.invalid

...when cpu is green, yellow, purple, or red --test now always returns a mail -s with CRITICAL (RED) message.

which may actually BE correct, since elsewhere on the page there is a red status and I've now put the COLOR on the main line referencing the whole page. In an attempt to narrow it back down to the one host, however, I used HOST=^Server1*, instead of PAGE=, but STILL get the CRITICAL (RED) message in the --test run, but actually got an email regarding the yellow cpu status (and not for a different yellow status on the same page).

--dump-config returns...
HOST=^Server1* COLOR=yellow,red     <---- NOTICE REVERSED COLORS (see above)??

MAIL user-4b28831ca997@xymon.invalid FORMAT=TEXT REPEAT=30

Interestingly enough, when I use a --color parameter with the --test, it performs correctly...
If --color of cpu is yellow, --test returns a warning (YELLOW) message.
If --color of cpu is red, --test returns a CRITICAL (RED) message.
If --color of cpu is green or purple, --test returns a FAILED (due to color) message.

By the way, my ALERTCOLORS="red,yellow,purple"

Could it possibly be that the --test feature needs more testing, and that the alerting is actually working the way I expect it should?
Actually, this could be more of a documentation issue I suppose.

xymond_alert when in --test mode should really always have a --color= specified (so that it knows what to test for).

Based on how it responded after you'd added that in, I believe it's working as intended there.

Regards,
-jc