
two ack questions

Sean R. Clark
Wed, 16 May 2007 11:38:23 -0400
Message-Id: <000b01c797d0$3d811550$user-f608fa63d7d9@xymon.invalid>

More info on this crazy page bomb

Looking at notifications.log
 

  grep hostname-swt09a-02.domain.com notifications.log | wc -l
664

All 664 here have the same timestamp

Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
Wed May 16 10:57:17 2007 hostname-swt09a-02.domain.com:if_stat (0.0.0.0) whatever at com[819] 1179327437 0
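
To check whether other hosts or timestamps show the same burst, a rough Python sketch along these lines could count entries per timestamp and host:test pair. It assumes each notifications.log entry is a single line laid out like the samples above (five timestamp fields, then host:test); the names count_bursts and report are only illustrative and not part of Hobbit.

```python
from collections import Counter


def count_bursts(lines, min_count=2):
    """Count log lines per (timestamp, host:test) pair.

    Assumes the layout seen in the samples: five whitespace-separated
    timestamp fields followed by a host:test field.
    Returns pairs seen at least min_count times, most frequent first.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        timestamp = " ".join(fields[:5])  # e.g. "Wed May 16 10:57:17 2007"
        target = fields[5]                # e.g. "host.example.com:if_stat"
        counts[(timestamp, target)] += 1
    return [(key, n) for key, n in counts.most_common() if n >= min_count]


def report(path, min_count=2):
    """Print one line per burst found in the log file at path."""
    with open(path) as fh:
        for (timestamp, target), n in count_bursts(fh, min_count):
            print(f"{n:6d}  {timestamp}  {target}")
```

Calling report("notifications.log", min_count=100) would list only the pairs that were notified at least 100 times within the same second.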


So Hobbit itself is sending them; it's not a problem with the mail server or mail queue.

This happens about 2-3 times a day, and each time it sends 300-700 alerts for
a single host.


The infopage for this host/test combination shows:

Service  Recipient            1st Delay  Stop after  Repeat  Time of Day  Colors
if_stat  whatever at com (U)  -          -           2h      -            red


The ack code sent in the page is clearly 0, but hobbitdboard lists a
valid ack code.


Any ideas on where to go next? I can't have hobbit sending 300-700 alerts for a
single host/test combination.


-----Original Message-----
From: Sean R. Clark [mailto:user-94e09d797e16@xymon.invalid] 
Sent: Tuesday, May 15, 2007 12:54 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: RE: [hobbit] two ack questions

 
When the status goes back to a non-critical state, hobbitd sends a recovery
message on the "alert" channel, so the alerting module can send a "recovered"
notice (if this has been configured).
 
No, I understand this part. I have it configured and working.

But when I enabled the bbpage module in hobbitlaunch, I got a bunch of
messages that said RED CRITICAL but the ACK code passed to it was 0


Subject: Hobbit [0] hostname-swt09a-02.domain.com:if_stat CRITICAL (RED)
(body) red Mon May 14 13:15:18 2007
	  results of test bleh  bleh


This wasn't a recovery - it was red when I started bbpage

It hasn't happened since yesterday. Maybe it has something to do with having
the paging module DISABLED and then restarting hobbit? I don't know.