LOG IGNORE matching
list Steve Holmes
I may still not be understanding how the pattern matching is done for the
LOG keyword.
I have:
LOG /var/adm/messages %(?-i)auth.error COLOR=yellow
IGNORE="%(?-i)sshd|flavor_basic: (null)"
Which I think should mean: look for the string "auth.error" in
/var/adm/messages and then ignore lines with "sshd" OR "flavor_basic:
(null)" in them.
If I *only* have IGNORE=sshd that seems to work, but I really need to ignore
both (at least for my testing), but when I do it as above, I get yellow
screens for auth.error lines even if they have the string "sshd" in them.
Am I missing something?
BTW, in BB I have a very long list of strings to ignore. Is there an easier
way to do that in hobbit other than to put each string into an IGNORE
clause?
Thanks,
Steve Holmes
list John G
Try adding back slashes.
IGNORE="%(?-i)sshd|flavor_basic: \(null\)"
$ pcretest
PCRE version 6.7 04-Jul-2006
re> /(?-i)sshd|flavor_basic: (null)/
data> flavor_basic: (null)
No match
$ pcretest
PCRE version 6.7 04-Jul-2006
re> /(?-i)sshd|flavor_basic: \(null\)/
data> flavor_basic: (null)
0: flavor_basic: (null)
data> sshd
0: sshd
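The same effect can be reproduced outside pcretest. The following is a minimal sketch that uses Python's `re` module as a stand-in for PCRE. That substitution is an assumption: Python does not accept a global `(?-i)` prefix, so it is left out here (Python patterns are case-sensitive by default). The helper name `is_ignored` is hypothetical and is not part of Hobbit.

```python
import re

# Unescaped parentheses form a capture group, so this pattern looks for the
# text "flavor_basic: null" (no literal parentheses in the match).
UNESCAPED = re.compile(r"sshd|flavor_basic: (null)")

# Escaped parentheses match the literal "(" and ")" characters.
ESCAPED = re.compile(r"sshd|flavor_basic: \(null\)")


def is_ignored(pattern, line):
    """Return True if the pattern matches anywhere in the line."""
    return pattern.search(line) is not None


sample = "flavor_basic: (null)"
print(is_ignored(UNESCAPED, sample))  # the unescaped form misses the literal text
print(is_ignored(ESCAPED, sample))    # the escaped form finds it
print(is_ignored(ESCAPED, "sshd"))    # the other alternative still matches
```

This only illustrates how the parentheses are interpreted by a regex engine; it does not model how Hobbit itself applies the IGNORE setting.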
list Sean R. Clark
First - what does a -1 for acktime, disabletime, and cookie mean when I run hobbitdboard? I see from the man page that they (-1) are valid, but I can't seem to find what they mean...
Second - what would cause an alert to go out the hobbitdalert channel but not have a valid ack code? I got this today:
Subject: Hobbit [0] hostname-swt09a-02.domain.com:if_stat CRITICAL (RED)
(body) red Mon May 14 13:15:18 2007
results of test bleh bleh
Is it normal not to have an ack code? And if not, how can I track down why that is?
-Sean
list Tod Hansmann
I can only answer your second question, and that only with a theory. We used to get this in bb all the time as well as hobbit now. We only get it with custom scripts and we thought it might be tests that get added to the display because of an alert, but have no corresponding test listed in bb-hosts, so the bb/hobbit daemon doesn't generate a test for something it's not even keeping track of. That would make sense to me, since you may not WANT it to do so in a custom script, but again it's only a theory. We haven't tested it. Just noticed patterns. One of these days I'm going to get time and just dive into the code. Tod Hansmann Network Engineer
list Sean R. Clark
Well, I would think that would be it, except that other if_stat tests generate ACK codes.
And on the issue of question #1: searching the mailing list archives (or Google) for -1 or "-1" is hard ;)
-Sean
list Henrik Størner
▸
On Mon, May 14, 2007 at 11:26:23PM -0400, Sean R. Clark wrote:
First - what does a -1 mean for acktime,disabletime, and cookie mean when I run hobbitdboard?
A green status cannot be ack'ed and doesn't have a cookie - so "acktime" and "cookie" are reported as "-1". "disabletime" is set to -1 when the test is "disabled until OK".
▸
Second - what would cause an alert to go out the hobbitdalert channel but not have a valid ack code?
When the status goes back to a non-critical state, hobbitd sends a recovery message on the "alert" channel, so the alerting module can send a "recovered" notice (if this has been configured). Regards, Henrik
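The meaning of the -1 values described above can be summarised in a small helper. This is only a sketch of the explanation in this message, not code from Hobbit; the function name `describe_ack_fields` is hypothetical, and it assumes the three fields have already been read as integers.

```python
def describe_ack_fields(acktime, disabletime, cookie):
    """Explain any -1 values among the three hobbitdboard fields."""
    notes = []
    if acktime == -1:
        # Reported for a status that cannot be ack'ed, such as green.
        notes.append("acktime=-1: status cannot be ack'ed (e.g. green)")
    if cookie == -1:
        # Reported for a status that has no cookie, such as green.
        notes.append("cookie=-1: status has no cookie (e.g. green)")
    if disabletime == -1:
        notes.append("disabletime=-1: test is disabled until OK")
    return notes


for note in describe_ack_fields(acktime=-1, disabletime=0, cookie=-1):
    print(note)
```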
list Mario Andre
Hi,
Checking the host-ack checkbox on the form on the critical systems page does not acknowledge all of the test alarms for a host. Is this working for anyone?
Thanks in advance,
Mario.
list Sean R. Clark
▸
When the status goes back to a non-critical state, hobbitd sends a recovery message on the "alert" channel, so the alerting module can send a "recovered" notice (if this has been configured).
No I understand this part - I have it configured and working
But when I enabled the bbpage module in hobbitlaunch, I got a bunch of
messages that said RED CRITICAL but the ACK code passed to it was 0
▸
Subject: Hobbit [0] hostname-swt09a-02.domain.com:if_stat CRITICAL (RED)
(body) red Mon May 14 13:15:18 2007
results of test bleh bleh
This wasn't a recovery - it was red when I started bbpage
It hasn't happened since yesterday, something with having the paging module
DISABLED and then restarting hobbit? I dunno
list Steve Holmes
Thanks for the hint. I'm still having a little trouble getting it to do what I want, but at least now I know about pcretest, which I didn't before.
Thanks,
Steve.
--
I believe I found the missing link between animal and civilized man. It is
us. -Konrad Lorenz, ethologist, Nobel laureate (1903-1989)
We in America do not have government by the majority. We have government by
the majority who participate. -Thomas Jefferson, third US president,
architect and author (1743-1826)
list Sean R. Clark
I spoke too soon. I just got 103 emails with
Hobbit [0] rochnygso-omrdd-cpe01.nyroc.rr.com:if_stat CRITICAL (RED)
red Mon May 14 20:19:26 2007
When I look at the hobbitdboard for the host, it shows a correct ack code
rochnygso-omrdd-cpe01.nyroc.rr.com|if_stat|red||1179233720|1179249886|1179251686|0|0|10.10.8.180|419285|red Tue May 15 13:24:43 2007
Hobbitd_alert -test shows it matching just 1 rule
00008773 2007-05-15 13:31:47 *** Match with 'HOST=%.*-pe-rtr.*|.*-ons.*|.*-p-rtr.*|.*-osc.*|.*-adm.*|.*-nypa-.*|.*-amp.* |.*-omrdd-.* |.*-pe0.*|.*-pe1.* TIME=12345:0800:2159' ***
00008773 2007-05-15 13:31:47 Matching host:service:page 'rochnygso-omrdd-cpe01.nyroc.rr.com:if_stat:regionalnet/omrdd' against rule line 534
00008773 2007-05-15 13:31:47 *** Match with 'MAIL user-379a78b7e224@xymon.invalid REPEAT=60 COLOR=red RECOVERED' ***
00008773 2007-05-15 13:31:47 Mail alert with command '/var/mail//sclark "Hobbit [12345] rochnygso-omrdd-cpe01.nyroc.rr.com:if_stat CRITICAL (RED)" user-7bfd2181a0ed@xymon.invalid'
Is there something in my rule setup that could be causing all these extra alerts?
-Sean
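To compare the board entry with the alert subject, it can help to split the hobbitdboard line into named fields. The sketch below assumes the field order hostname, testname, color, flags, lastchange, logtime, validtime, acktime, disabletime, sender, cookie, line1. That order is inferred from the sample line in this message and should be checked against the hobbitd man page. It also assumes the wrapped timestamp in the sample is rejoined into a single number. The names `FIELDS` and `parse_board_line` are hypothetical.

```python
# Assumed field order; verify against the hobbitdboard documentation.
FIELDS = [
    "hostname", "testname", "color", "flags", "lastchange", "logtime",
    "validtime", "acktime", "disabletime", "sender", "cookie", "line1",
]


def parse_board_line(line):
    """Split one '|'-separated board line into a dict keyed by field name."""
    # Limit the split so any '|' inside the final text field is preserved.
    parts = line.split("|", len(FIELDS) - 1)
    return dict(zip(FIELDS, parts))


sample = (
    "rochnygso-omrdd-cpe01.nyroc.rr.com|if_stat|red||1179233720|1179249886|"
    "1179251686|0|0|10.10.8.180|419285|red Tue May 15 13:24:43 2007"
)
record = parse_board_line(sample)
print(record["hostname"], record["testname"], record["color"])
print("cookie:", record["cookie"])
```

Under that assumed order, the cookie field of the sample line is non-zero, while the subject of the alert mails shows `[0]`.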
list Sean R. Clark
More info on this crazy page bomb.
Looking at notifications.log:
grep hostname-swt09a-02.domain.com notifications.log | wc -l
664
All 664 here have the same timestamp:
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
So hobbit is sending them; it's not something with the mailserver/mailqueue.
It's happening about 2-3 times a day, where it will send 300-700 alerts for 1 host.
The infopage for this host/test combination shows:
Service   Recipient             1st Delay   Stop after   Repeat   Time of Day   Colors
if_stat   whatever at com (U)   -           -            2h       -             red
The ack code sent on the page is clearly 0, but the hobbitdboard lists a valid ack code.
Any ideas where I go next? I can't have hobbit sending 300-700 alerts for a host/test combination.
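The grep/wc check above can be extended to find every burst in the log, not just the one for a known host. The sketch below groups notifications.log lines by timestamp and host:test and counts them. It assumes each line starts with a five-word timestamp followed by the host:test pair, as in the sample lines in this message. The function name `count_bursts` is hypothetical.

```python
from collections import Counter


def count_bursts(lines):
    """Count notification lines per (timestamp, host:test) pair."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or unexpected lines
        timestamp = " ".join(fields[:5])  # e.g. "Wed May 16 10:57:17 2007"
        host_test = fields[5]             # e.g. "host.example.com:if_stat"
        counts[(timestamp, host_test)] += 1
    return counts


sample = [
    "Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat "
    "(0.0.0.0) whatever at com[819] 1179327437 0"
] * 3

for (timestamp, host_test), n in count_bursts(sample).most_common():
    print(n, timestamp, host_test)
```

To run it against a real log, pass an open file object instead of the sample list, for example `count_bursts(open("notifications.log"))`.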