Xymon Mailing List Archive

LOG IGNORE matching

11 messages in this thread

list Steve Holmes · Mon, 14 May 2007 17:00:14 -0400 ·
I may still not be understanding how the pattern matching is done for the
LOG keyword.

I have:

        LOG /var/adm/messages %(?-i)auth.error COLOR=yellow IGNORE="%(?-i)sshd|flavor_basic: (null)"

Which I think should mean: look for the string "auth.error" in
/var/adm/messages and then ignore lines with "sshd" OR "flavor_basic:
(null)" in them.

With *only* IGNORE=sshd it seems to work, but I really need to ignore both
(at least for my testing). When I do it as above, I still get yellow
screens for auth.error lines even if they contain the string "sshd".

Am I missing something?

BTW, in BB I have a very long list of strings to ignore. Is there an easier
way to do that in hobbit other than to put each string into an IGNORE
clause?

Thanks,
Steve Holmes
list John G · Mon, 14 May 2007 19:22:34 -0400 ·
Try adding backslashes to escape the parentheses:
IGNORE="%(?-i)sshd|flavor_basic: \(null\)"

$ pcretest
PCRE version 6.7 04-Jul-2006

  re> /(?-i)sshd|flavor_basic: (null)/
data> flavor_basic: (null)
No match
$ pcretest
PCRE version 6.7 04-Jul-2006

  re> /(?-i)sshd|flavor_basic: \(null\)/
data> flavor_basic: (null)
 0: flavor_basic: (null)
data> sshd
 0: sshd
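
The pcretest session above can be reproduced with a rough Python sketch. This uses Python's `re` module rather than PCRE, so it is only an approximation of what hobbit does; the `(?-i)` prefix is left out because `re` is case-sensitive by default. The helper `build_ignore_pattern` is a hypothetical name, not part of hobbit. It shows one possible way to turn a long list of literal strings into a single alternation, which may help with the question about long ignore lists.

```python
import re

# Unescaped parentheses form a capture group, so this pattern looks for
# the text "flavor_basic: null" and never matches a literal "(null)".
unescaped = re.compile(r"sshd|flavor_basic: (null)")

# Escaped parentheses match the literal characters "(" and ")".
escaped = re.compile(r"sshd|flavor_basic: \(null\)")

line = "flavor_basic: (null)"
print(bool(unescaped.search(line)))  # False
print(bool(escaped.search(line)))    # True


def build_ignore_pattern(literals):
    """Hypothetical helper: join literal strings into one alternation,
    escaping any regex metacharacters in each string."""
    return "|".join(re.escape(text) for text in literals)


pattern = build_ignore_pattern(["sshd", "flavor_basic: (null)"])
print(bool(re.search(pattern, "auth.error sshd[123]: failed login")))  # True
```

Note that `re.escape` may escape more characters than strictly necessary (for example spaces), so the generated pattern is worth checking with pcretest before pasting it into an IGNORE clause.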
list Sean R. Clark · Mon, 14 May 2007 23:26:23 -0400 ·

First - what does a -1 mean for acktime, disabletime, and cookie when I
run hobbitdboard?

I see from the man page that -1 is a valid value for them, but I can't seem
to find what it means...


Second - what would cause an alert to go out the hobbitdalert channel but
not have a valid ack code?


I got this today:

Subject: Hobbit [0] hostname-swt09a-02.domain.com:if_stat CRITICAL (RED)
(body) red Mon May 14 13:15:18 2007
	  results of test bleh  bleh


Is it normal not to have an ack code? And if not, how can I track down why
this happens?


-Sean
list Tod Hansmann · Tue, 15 May 2007 08:52:04 -0600 ·
I can only answer your second question, and that only with a theory.  We
used to get this in bb all the time, and we get it in hobbit now as well.
We only get it with custom scripts, and we thought it might be tests that
get added to the display because of an alert but have no corresponding
test listed in bb-hosts, so the bb/hobbit daemon doesn't generate a test
for something it's not even keeping track of.

That would make sense to me, since you may not WANT it to do so in a
custom script, but again it's only a theory.  We haven't tested it.
Just noticed patterns.

One of these days I'm going to get time and just dive into the code.

Tod Hansmann
Network Engineer
list Sean R. Clark · Tue, 15 May 2007 11:14:55 -0400 ·
Well, I would think that would be it, except that other if_stat tests
generate ACK codes.

And on the issue of question #1:
Also, searching the mailing list archives (or Google) for -1 or "-1" is hard
;)


-Sean 
list Henrik Størner · Tue, 15 May 2007 17:30:52 +0200 ·
quoted from Sean R. Clark
On Mon, May 14, 2007 at 11:26:23PM -0400, Sean R. Clark wrote:
First - what does a -1 mean for acktime, disabletime, and cookie when I
run hobbitdboard?
A green status cannot be ack'ed and doesn't have a cookie - so "acktime"
and "cookie" are reported as "-1".

"disabletime" is set to -1 when the test is "disabled until OK".
quoted from Sean R. Clark
Second - what would cause an alert to go out the hobbitdalert channel but
not have a valid ack code?
When the status goes back to a non-critical state, hobbitd sends a recovery 
message on the "alert" channel, so the alerting module can send a
"recovered" notice (if this has been configured).
 

Regards,
Henrik
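
The -1 values described above could be interpreted with a small sketch like the one below. It assumes hobbitdboard was run with its default field list in the order shown in `FIELDS`; that order is an assumption here and changes if a `fields=` option is given. The function names and the green sample line are hypothetical, made up for illustration.

```python
# Assumed default hobbitdboard field order; adjust if "fields=..." is used.
FIELDS = ["hostname", "testname", "color", "flags", "lastchange", "logtime",
          "validtime", "acktime", "disabletime", "sender", "cookie", "line1"]


def parse_board_line(line):
    """Split one pipe-delimited board line into a dict keyed by field name."""
    values = line.rstrip("\n").split("|", len(FIELDS) - 1)
    return dict(zip(FIELDS, values))


def describe_special_values(status):
    """Return notes for the -1 values explained in the message above."""
    notes = []
    if status["acktime"] == "-1":
        notes.append("acktime -1: status cannot be acked (e.g. green)")
    if status["cookie"] == "-1":
        notes.append("cookie -1: no cookie for this status")
    if status["disabletime"] == "-1":
        notes.append("disabletime -1: disabled until OK")
    return notes


# Hypothetical green status line, invented for this example.
sample = ("www.example.com|conn|green||1179233720|1179249886|1179251686"
          "|-1|0|10.0.0.1|-1|green Tue May 15 13:24:43 2007")

status = parse_board_line(sample)
for note in describe_special_values(status):
    print(note)
```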
list Mario Andre · Tue, 15 May 2007 13:02:50 -0300 ·
Hi,

Checking the host-ack checkbox on the form of the critical systems page does
not acknowledge all the test alarms from a host.

Is this working for someone?

Thanks in advance,

Mario.
list Sean R. Clark · Tue, 15 May 2007 12:54:22 -0400 ·
quoted from Henrik Størner
 
When the status goes back to a non-critical state, hobbitd sends a recovery
message on the "alert" channel, so the
alerting module can send a "recovered" notice (if this has been configured).
 
No, I understand this part - I have it configured and working.

But when I enabled the bbpage module in hobbitlaunch, I got a bunch of
messages that said RED CRITICAL but the ACK code passed to it was 0.
quoted from Sean R. Clark


Subject: Hobbit [0] hostname-swt09a-02.domain.com:if_stat CRITICAL (RED)
(body) red Mon May 14 13:15:18 2007
	  results of test bleh  bleh


This wasn't a recovery - it was red when I started bbpage

It hasn't happened since yesterday. Could it be something to do with having
the paging module DISABLED and then restarting hobbit? I dunno.
list Steve Holmes · Tue, 15 May 2007 13:35:34 -0400 ·
Thanks for the hint. I'm still having a little trouble getting it to do what
I want, but at least now I know about pcretest, which I didn't before.

Thanks,
Steve.
list Sean R. Clark · Tue, 15 May 2007 13:42:36 -0400 ·
I spoke too soon

I just got 103 emails with 

Hobbit [0] rochnygso-omrdd-cpe01.nyroc.rr.com:if_stat CRITICAL (RED)
red Mon May 14 20:19:26 2007


When I look at the hobbitdboard for the host, it shows a correct ack code


rochnygso-omrdd-cpe01.nyroc.rr.com|if_stat|red||1179233720|1179249886|1179251686|0|0|10.10.8.180|419285|red Tue May 15 13:24:43 2007


Hobbitd_alert -test shows it matching just 1 rule

00008773 2007-05-15 13:31:47 *** Match with 'HOST=%.*-pe-rtr.*|.*-ons.*|.*-p-rtr.*|.*-osc.*|.*-adm.*|.*-nypa-.*|.*-amp.*|.*-omrdd-.*|.*-pe0.*|.*-pe1.* TIME=12345:0800:2159' ***
00008773 2007-05-15 13:31:47 Matching host:service:page 'rochnygso-omrdd-cpe01.nyroc.rr.com:if_stat:regionalnet/omrdd' against rule line 534
00008773 2007-05-15 13:31:47 *** Match with 'MAIL user-379a78b7e224@xymon.invalid REPEAT=60 COLOR=red RECOVERED' ***
00008773 2007-05-15 13:31:47 Mail alert with command '/var/mail//sclark "Hobbit [12345] rochnygso-omrdd-cpe01.nyroc.rr.com:if_stat CRITICAL (RED)" user-7bfd2181a0ed@xymon.invalid'


Is there something in my rule setup that could be causing all these extra
alerts?
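
One way to sanity-check the HOST rule itself is to test each alternative against the hostname separately. The sketch below does that with Python's `re` module, which is only a stand-in for the PCRE matching hobbit performs; the leading `%` (the regex marker in the rule) is dropped, and splitting on `|` is safe only because none of these alternatives contains a group or an escaped pipe.

```python
import re

# The HOST pattern from the rule above, without the leading "%".
HOST_RULE = (r".*-pe-rtr.*|.*-ons.*|.*-p-rtr.*|.*-osc.*|.*-adm.*|.*-nypa-.*"
             r"|.*-amp.*|.*-omrdd-.*|.*-pe0.*|.*-pe1.*")

host = "rochnygso-omrdd-cpe01.nyroc.rr.com"

# Try every alternative on its own to see which ones actually match.
matching = [alt for alt in HOST_RULE.split("|") if re.search(alt, host)]
print(matching)  # ['.*-omrdd-.*']
```

Only the `-omrdd-` alternative matches this host (`-pe0` does not, because the name contains `-cpe0`), which agrees with the single rule match reported above. That suggests the pattern alone is not selecting the host more than once, though it does not rule out other causes.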


-Sean
list Sean R. Clark · Wed, 16 May 2007 11:38:23 -0400 ·
More info on this crazy page bomb

Looking at notifications.log
 

  grep hostname-swt09a-02.domain.com notifications.log | wc -l
664

All 664 here have the same timestamp

Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0


So hobbit is sending them; it's not something with the mailserver/mailqueue.

It's happening about 2-3 times a day, and each time it sends 300-700 alerts
for 1 host.
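
To see how many identical entries pile up per burst, rather than just a total from `wc -l`, something like the following sketch could be used. It assumes each notification is a single line in the log (as in the unwrapped entries above); the function name and the second sample line are hypothetical.

```python
from collections import Counter


def count_duplicates(lines):
    """Count identical non-empty lines and keep only those seen more than once."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return {entry: n for entry, n in counts.items() if n > 1}


# One entry taken from the log excerpt above, repeated three times,
# plus a made-up entry that appears only once.
entry = ("Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat "
         "(0.0.0.0) whatever at com[819] 1179327437 0")
other = ("Wed May 16 11:05:02 2007 otherhost.domain.com:conn "
         "(0.0.0.0) whatever at com[820] 1179327902 0")
sample_lines = [entry, entry, entry, other]

for line, n in count_duplicates(sample_lines).items():
    print(n, line)
```

With a real log, passing the open file object (for example `open("notifications.log")`, path assumed) to `count_duplicates` would list every entry that was written more than once along with its count.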


The infopage for this host/test combination shows:

Service  Recipient            1st Delay  Stop after  Repeat  Time of Day  Colors
if_stat  whatever at com (U)  -          -           2h      -            red


The ack code sent on the page is clearly 0, but the hobbitdboard lists a
valid ack code.


Any ideas where I go next? I can't have hobbit sending 300-700 alerts for a
host/test combination.