Xymon Mailing List Archive

cpu alerts

list Henrik Størner
Tue, 8 Aug 2006 23:00:39 +0200
Message-Id: <user-1a3fbb1a1127@xymon.invalid>

On Tue, Aug 01, 2006 at 01:29:00PM -0400, Bill Perez wrote:
> > Could you show us a copy of the cpu history log (in
> > ~hobbit/data/hist/HOSTNAME.cpu) compared with the notifications log
> > from ~hobbit/server/logs/notifications.log ?
>
> Here is the hostname.cpu and section from notifications.log for those alerts
> this morning:
>
> From /hobbit/data/hist/HOSTNAME.cpu
> Tue Aug  1 10:34:30 2006 yellow 1154442870 1200
> Tue Aug  1 10:54:30 2006 red 1154444070 600
> Tue Aug  1 11:04:30 2006 green 1154444670 299
> Tue Aug  1 11:09:29 2006 yellow 1154444969 301
> Tue Aug  1 11:14:30 2006 red 1154445270 301
> Tue Aug  1 11:19:31 2006 green 1154445571
>
> Tue Aug  1 10:54:30 2006 uswosfad.domain.com.cpu (10.128.40.31) user-b9608d6c1a4c@xymon.invalid[175] 1154444070 200
> Tue Aug  1 11:04:30 2006 uswosfad.domain.com.cpu (10.128.40.31) user-b9608d6c1a4c@xymon.invalid[175] 1154444670 200 1800
> Tue Aug  1 11:19:30 2006 uswosfad.domain.com.cpu (10.128.40.31) user-b9608d6c1a4c@xymon.invalid[175] 1154445570 200
> Tue Aug  1 11:19:42 2006 uswosfad.domain.com.cpu (10.128.40.31) user-b9608d6c1a4c@xymon.invalid[175] 1154445581 200 612

OK, Hobbit thinks the first event begins at 10:34, when the status
goes yellow. Even though this doesn't trigger an alert, Hobbit registers
it as the start time of the event. So when the status goes red at 10:54,
20 minutes into the event, your 10 minute delay has already elapsed and
you get an immediate alert. Then when it goes green at 11:04 you of
course get a recovery notice.

Same thing when it goes yellow again at 11:09. No alert is sent, but
this time is registered as the start of the event. So at 11:14 when it
goes red you do not get an alert (11:09->11:14 is only 5 minutes), but
you do get the alert at 11:19:30, once the 10 minutes are up. When it
goes green at 11:19:31, Hobbit sends out a "recovered" message.
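
The timing described above can be sketched in a few lines of Python. This
is only an illustration of the arithmetic, not Hobbit's actual code: the
function name alert_due_times and the DELAY constant are made up for the
example, and it assumes the history log only records status changes. The
timestamps are the epoch values from the history log quoted above.

```python
# Hypothetical sketch of the delay arithmetic; not taken from Hobbit itself.
DELAY = 600  # the 10 minute delay, in seconds

# (epoch timestamp, color) pairs copied from the quoted HOSTNAME.cpu log
HISTORY = [
    (1154442870, "yellow"),
    (1154444070, "red"),
    (1154444670, "green"),
    (1154444969, "yellow"),
    (1154445270, "red"),
    (1154445571, "green"),
]


def alert_due_times(history, delay):
    """Return the earliest time an alert becomes due in each red period.

    The event start is the moment the status leaves green, so the delay
    is counted from the first yellow, not from the first red.
    """
    due = []
    event_start = None
    for i, (ts, color) in enumerate(history):
        if color == "green":
            event_start = None  # event over, next non-green starts a new one
            continue
        if event_start is None:
            event_start = ts
        if color == "red":
            # The red period lasts until the next status change, if any.
            end = history[i + 1][0] if i + 1 < len(history) else float("inf")
            t = max(ts, event_start + delay)
            if t < end:
                due.append(t)
    return due


print(alert_due_times(HISTORY, DELAY))
```

For the first event the alert is due at 1154444070, the moment the status
goes red, which matches the first line of the notifications log. For the
second it becomes due at 1154445569, and the log shows the alert going out
at 1154445570, just before the status returns to green at 1154445571.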

What time should Hobbit consider the start-of-event time?  Some prefer 
the current arrangement where it uses the time it goes non-green; others 
prefer the time it goes to a color which triggers an alert.  I've heard 
arguments both ways.
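
To make the difference between the two choices concrete, here is a small
Python sketch that applies both to the second event from the logs in this
thread (yellow at 1154444969, red at 1154445270, green at 1154445571). The
function first_alert and its start_on parameter are hypothetical names
for this example only; they do not correspond to any Hobbit setting.

```python
def first_alert(yellow_at, red_at, green_at, delay, start_on="non-green"):
    """Earliest alert time for one yellow -> red -> green event, or None.

    start_on="non-green": the delay counts from the first non-green status.
    start_on="alerting":  the delay counts from the first alerting color (red).
    """
    start = yellow_at if start_on == "non-green" else red_at
    due = max(red_at, start + delay)
    # No alert if the status is back to green before the delay runs out.
    return due if due < green_at else None


# Second event from the logs, with a 600 second (10 minute) delay.
print(first_alert(1154444969, 1154445270, 1154445571, 600, "non-green"))
print(first_alert(1154444969, 1154445270, 1154445571, 600, "alerting"))
```

Counting from the first non-green status, the alert becomes due two seconds
before the recovery. Counting from the first red status, the event is over
long before the 10 minutes have passed, so no alert would be sent at all.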


Regards,
Henrik