Henrik,
Thank you so much for replying. I caused a yellow alarm for procs on host rsoimpm1, and I am expecting the rule to fire after 15 minutes. Here is what I see in the log file, in more detail:
2005-02-01 15:17:29 hobbitd_alert: Got message 37 @@page#37|1107271049.602362|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107272849|yellow|green|1107271049|CAY/pmservers|947420
2005-02-01 15:17:29 Got page message from rsoimpm1:procs
2005-02-01 15:17:29 Alert status changed from 0 to 1
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs %.*:(NULL):(NULL)
2005-02-01 15:17:29 pcre_exec returned 1
2005-02-01 15:17:29 Checking explicit color setting 10000000020 against 4 gives 1
2005-02-01 15:17:29 Found a first matching rule
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<900
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<39225600
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 Checking explicit color setting 10000000040 against 4 gives 0
2005-02-01 15:17:29 No more secondary matching rule
2005-02-01 15:17:29 1 alerts to go
2005-02-01 15:17:29 Compiling regex .*
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs %.*:(NULL):(NULL)
2005-02-01 15:17:29 pcre_exec returned 1
2005-02-01 15:17:29 Checking explicit color setting 10000000020 against 4 gives 1
2005-02-01 15:17:29 Found a first matching rule
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<900
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<39225600
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 send_alert rsoimpm1:procs state 0
2005-02-01 15:17:29 Checking explicit color setting 10000000040 against 4 gives 0
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs %.*:(NULL):(NULL)
2005-02-01 15:17:29 No more secondary matching rule
2005-02-01 15:17:29 pcre_exec returned 1
2005-02-01 15:17:29 Checking explicit color setting 10000000020 against 4 gives 1
2005-02-01 15:17:29 Found a first matching rule
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<900
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 event start: 1107271049, failed minduration 0<39225600
2005-02-01 15:17:29 criteriamatch rsoimpm1:procs (NULL):(NULL):(NULL)
2005-02-01 15:17:29 Checking explicit color setting 10000000040 against 4 gives 0
2005-02-01 15:17:29 No more secondary matching rule
I caused a yellow alarm at 15:17; so far OK. The alert status changed, the criteria matched, the regex matched, the color matched, and a rule was found. It then checks minduration, which fails because the event has not yet lasted 15 minutes (0 < 900 seconds). Sorry, I did add to the debug print statements in the source code.
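To make clear what I expect that check to do, here is a minimal sketch in Python. It assumes the rule compares the seconds elapsed since the event start against a threshold in seconds, which is what the "0<900" in the log suggests. The function name duration_ok is hypothetical and is not taken from the hobbitd_alert source.

```python
def duration_ok(now: int, event_start: int, min_duration: int) -> bool:
    """Return True once the event has lasted at least min_duration seconds."""
    elapsed = now - event_start
    return elapsed >= min_duration


EVENT_START = 1107271049  # from the log: "event start: 1107271049"
MIN_DURATION = 900        # from the log: "failed minduration 0<900"

# At 15:17:29 the event has only just started, so the check fails.
print(duration_ok(EVENT_START, EVENT_START, MIN_DURATION))

# Fifteen minutes later the same check should pass.
print(duration_ok(EVENT_START + 900, EVENT_START, MIN_DURATION))
```

The first call corresponds to the failure shown in the log; the second is the point at which I would expect the alert to go out.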
2005-02-01 15:22:29 hobbitd_alert: Got message 58 @@page#58|1107271349.301483|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273149|yellow|yellow|1107271049|CAY/pmservers|947420
2005-02-01 15:22:29 Got page message from rsoimpm1:procs
2005-02-01 15:22:29 0 alerts to go
2005-02-01 15:27:29 hobbitd_alert: Got message 79 @@page#79|1107271649.155212|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273449|yellow|yellow|1107271049|CAY/pmservers|947420
2005-02-01 15:27:29 Got page message from rsoimpm1:procs
2005-02-01 15:27:29 0 alerts to go
2005-02-01 15:32:28 hobbitd_alert: Got message 101 @@page#101|1107271948.980583|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273748|yellow|yellow|1107271049|CAY/pmservers|947420
2005-02-01 15:32:28 Got page message from rsoimpm1:procs
2005-02-01 15:32:28 0 alerts to go
2005-02-01 15:37:28 hobbitd_alert: Got message 123 @@page#123|1107272248.884069|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107274048|yellow|yellow|1107271049|CAY/pmservers|947420
2005-02-01 15:37:28 Got page message from rsoimpm1:procs
2005-02-01 15:37:28 0 alerts to go
So it's as if nothing happens afterwards? Hopefully I included all the relevant parts of the log file; I didn't want the posting to be too long. Any ideas?
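To show how long the status had been yellow at each of the later page messages, here is a small Python sketch that splits the four messages on "|" and subtracts the tenth field from the message timestamp. The field names are my guesses from the values (the tenth field matches the "event start" in the debug output); they are not taken from the Hobbit documentation.

```python
# Field names are guesses inferred from the values, not from Hobbit docs.
FIELDS = ["marker", "timestamp", "sender_ip", "host", "test", "host_ip",
          "expires", "color", "prev_color", "last_change", "page", "cookie"]

MESSAGES = [
    "@@page#58|1107271349.301483|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273149|yellow|yellow|1107271049|CAY/pmservers|947420",
    "@@page#79|1107271649.155212|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273449|yellow|yellow|1107271049|CAY/pmservers|947420",
    "@@page#101|1107271948.980583|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107273748|yellow|yellow|1107271049|CAY/pmservers|947420",
    "@@page#123|1107272248.884069|166.34.57.239|rsoimpm1|procs|166.34.57.239|1107274048|yellow|yellow|1107271049|CAY/pmservers|947420",
]


def parse_page(message: str) -> dict:
    """Split one page message into a dict keyed by the guessed field names."""
    return dict(zip(FIELDS, message.split("|")))


def seconds_in_state(message: str) -> int:
    """Whole seconds between the last status change and this message."""
    msg = parse_page(message)
    return int(float(msg["timestamp"])) - int(msg["last_change"])


for m in MESSAGES:
    print(parse_page(m)["marker"], seconds_in_state(m))
```

By this calculation the message at 15:32:28 arrives 899 seconds after the change, just under the 900-second threshold, while the one at 15:37:28 arrives 1199 seconds after it. So if the 900-second rule were being re-evaluated, I would expect it to pass on that last message, yet the log still shows "0 alerts to go".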
~David Gore
Henrik Stoerner wrote:
On Tue, Feb 01, 2005 at 01:02:58AM +0000, David Gore wrote:
As you can see from the output below, a DURATION of '15m' translates to 653760.
I'll look into that
Either we have something configured wrong or DURATION is broken?
HOST=% COLOR=yellow
MAIL user-66f2c06d9d16@xymon.invalid REPEAT=8h DURATION>15
MAIL user-bce0fa03bec0@xymon.invalid REPEAT=8h DURATION>15m
"HOST=%" is definitely wrong. "HOST=%.*" is what you want.
Henrik
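For reference, a quick arithmetic check relating the two minduration thresholds in the debug log to the DURATION settings quoted above. This is only a sketch under two assumptions that the log does not confirm: that the logged thresholds are in seconds, and that they appear in the same order as the two MAIL lines.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 86400

plain_threshold = 900          # logged as "failed minduration 0<900"
suffixed_threshold = 39225600  # logged as "failed minduration 0<39225600"

# Assumed to correspond to DURATION>15: 900 seconds is 15 minutes.
print(plain_threshold // SECONDS_PER_MINUTE)

# Assumed to correspond to DURATION>15m: this gives the 653760 reported earlier.
print(suffixed_threshold // SECONDS_PER_MINUTE)

# The same threshold expressed in whole days.
print(suffixed_threshold // SECONDS_PER_DAY)
```

If those assumptions hold, the rule written with the "m" suffix would not fire until the status had been yellow for 454 days, while the rule without a suffix waits the intended 15 minutes.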