cores
list David Gore
It is possible the cores are caused by running two clients on the same host, both redirecting to ~hobbitclient/tmp/msg.txt. That would at least explain my occasional rash of purples followed by greens. I am going to work around that.
If I remove most of the entries from bin/hobbitclient-osf1.sh, will the back-end get annoyed and engage in odd behavior? Or will it simply not send cpu, procs, disk, etc. to the web page?
-- David
list Henrik Størner
▸
On Mon, Aug 15, 2005 at 10:41:42PM +0000, David Gore wrote:
It is possible the cores are caused by running two clients on the same host both redirecting to ~hobbitclient/tmp/msg.txt. That would explain my occasional rash of purples followed by greens at least. I am going to work around that.
I don't think so. It is your server modules that are crashing, so what happens on the client shouldn't matter at all. Unfortunately, the backtrace you've sent from the core files doesn't reveal much about where the crash happens, which indicates that something is trashing the stack and causing the crash.
To begin with, I'd like you to add "--debug" to the hobbitd command line. This will cause a lot of output to go into your hobbitd.log file; the interesting part is obviously what happens when it crashes. I'd like to see the full log file, though. I'll e-mail you details of where you can upload it, since it's probably too large for e-mail.
▸
If I remove most of the entries from bin/hobbitclient-osf1.sh, will the back-end get annoyed and engage in odd behavior? Or will it simply not send cpu, procs, disk, etc. to the web page?
It should simply stop sending the messages it doesn't have data for.
Regards,
Henrik
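As a rough sketch of what adding the flag might look like, assuming hobbitd is started by hobbitlaunch from a [hobbitd] section in hobbitlaunch.cfg: the ENVFILE path and the --log option below are placeholders, not taken from this thread. Keep whatever your existing CMD line already contains and only add "--debug".

```
[hobbitd]
	# Placeholder path: keep the ENVFILE line your file already has.
	ENVFILE /path/to/hobbitserver.cfg
	# Add --debug to the existing command line; leave the other options unchanged.
	CMD hobbitd --debug --log=$BBSERVERLOGS/hobbitd.log
```

Remember to remove the flag again once the log has been collected, since debug output makes hobbitd.log grow quickly.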
list Wes Neal
Below is all I have in my hobbit-alerts.cfg file currently. I first tested
this with no macro at all and I got the emails fine, but now when trying to
build macros it does not send the mail.
# PAGER GROUPS
$nocsupp = user-337afe88c8b7@xymon.invalid,user-5944348b568b@xymon.invalid
# HOST GROUPS
$testing = tardis,nocsunray04,weslap
HOST=$testing
MAIL $nocsupp REPEAT=10 RECOVERED
Thanks
Wes
list Peter Welter
Avoid using spaces. Change
$nocsupp = user-337afe88c8b7@xymon.invalid,user-5944348b568b@xymon.invalid
to
$nocsupp=user-337afe88c8b7@xymon.invalid,user-5944348b568b@xymon.invalid
The same goes for
$testing = tardis,nocsunray04,weslap
Peter
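Applied to the whole file from the original post, Peter's suggestion would look roughly like this. It is a sketch that assumes nothing else in the file needs to change: only the spaces around "=" in the two macro definitions are removed.

```
# PAGER GROUPS
$nocsupp=user-337afe88c8b7@xymon.invalid,user-5944348b568b@xymon.invalid
# HOST GROUPS
$testing=tardis,nocsunray04,weslap
HOST=$testing
MAIL $nocsupp REPEAT=10 RECOVERED
```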
list Humberto Cabrera
Hi everyone, in:
$CBDC_EMAIL=MAIL $EMAIL REPEAT=10M TIME=W:0000:2359,6:0000:2359,0:0000:0100,0:0430:2359
$CBDC_CELL=MAIL $CELL FORMAT=sms REPEAT=20M TIME=W:0000:2359,6:0000:2359,0:0000:0100,0:0430:2359
$CBTS1_EMAIL=MAIL $EMAIL REPEAT=10M TIME=W:0000:2359,6:0000:2359,0:0000:2000,0:2025:2359
$CBTS1_CELL=MAIL $CELL FORMAT=sms REPEAT=20M TIME=W:0000:2359,6:0000:2359,0:0000:2000,0:2025:2359
$CBWS_CELL=MAIL $CELL FORMAT=sms REPEAT=20M TIME=W:0000:0900,W:1730:2359,60:0000:2359
$RPPRO_EMAIL=MAIL $EMAIL REPEAT=10M TIME=*:0000:0305,*:0310:2130,*:2200:2359
$RPPRO_CELL=MAIL $CELL FORMAT=sms REPEAT=20M TIME=*:0000:0305,*:0310:2130,*:2200:2359
$WEB_CELL=MAIL $CELL FORMAT=sms REPEAT=20M TIME=*:0000:0200,*:0500:2359
Could someone tell me what the W stands for? I'm assuming Weekends?
Also, for example, in:
6:0000:2359,0:0000:0100,0:0430:2359
What does the 6: stand for? Or the 0:? Any help would be greatly appreciated.
Thanks!
Humberto Cabrera
Systems Administrator
Cosabella
list Larry Barber
'W' stands for weekdays, not weekends. 0 and 6 are days of the week: Sunday and Saturday, respectively.
Thanks,
Larry Barber
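For reference, here is one of the TIME settings from the question, annotated field by field. This is a reading based on the day:start:end format that the bb-hosts man page describes for DOWNTIME; the comment lines are annotations added here, not part of the original configuration.

```
# TIME=W:0000:2359,6:0000:2359,0:0000:0100,0:0430:2359
#
#   W:0000:2359   weekdays (Monday to Friday), 00:00 to 23:59
#   6:0000:2359   Saturday, 00:00 to 23:59
#   0:0000:0100   Sunday, 00:00 to 01:00
#   0:0430:2359   Sunday, 04:30 to 23:59
#
# Net effect: alerts may be sent at any time except Sunday 01:00 to 04:30.
$CBDC_EMAIL=MAIL $EMAIL REPEAT=10M TIME=W:0000:2359,6:0000:2359,0:0000:0100,0:0430:2359
```

In the other entries, "*" matches every day, and as far as I can tell a day field such as "60" lists several days at once (Saturday and Sunday).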
list Humberto Cabrera
Thank you very much. By any chance, can you point me to the documentation that covers these settings? I've read the "setting up alerts" doc on the Hobbit site, and nothing regarding those parameters was mentioned. Once again, thank you.
list Larry Barber
The hobbit-alerts.cfg man page refers you to the bb-hosts man page for the TIME specification, in particular the section in bb-hosts dealing with DOWNTIME.
Thanks,
Larry Barber
list Colin Coe
Hi all
The alerting is starting to take shape but I've a question regarding
how the alerting works. If I have a stanza similar to the following,
how is it evaluated? Once for all hosts, or for one host at a time?
---
HOST=%.*
# Proliant tests
MAIL user-4c524593359c@xymon.invalid SERVICE=proliant FORMAT=SMS REPEAT=1440m
MAIL user-4c524593359c@xymon.invalid SERVICE=proliant FORMAT=SMS RECOVERED
# conn where status is RED
MAIL user-4c524593359c@xymon.invalid COLOR=red SERVICE=conn EXPAGE=dev REPEAT=1440m
MAIL user-4c524593359c@xymon.invalid COLOR=red SERVICE=conn EXPAGE=dev RECOVERED
# conn where status is RED (dev/test)
MAIL user-65aef167d5bd@xymon.invalid COLOR=red SERVICE=conn PAGE=dev REPEAT=1440m
MAIL user-65aef167d5bd@xymon.invalid COLOR=red SERVICE=conn PAGE=dev RECOVERED
# cpu,disk,memory where status is RED
MAIL user-4c524593359c@xymon.invalid COLOR=red SERVICE=cpu,disk,memory EXPAGE=dev REPEAT=1440m
MAIL user-4c524593359c@xymon.invalid COLOR=red SERVICE=cpu,disk,memory EXPAGE=dev RECOVERED
# Dev servers
MAIL user-65aef167d5bd@xymon.invalid COLOR=red SERVICE=cpu,disk,memory PAGE=dev REPEAT=1440m
MAIL user-65aef167d5bd@xymon.invalid COLOR=red SERVICE=cpu,disk,memory PAGE=dev RECOVERED
# Non-dev status YELLOW
MAIL user-65aef167d5bd@xymon.invalid COLOR=yellow SERVICE=cpu,disk,memory REPEAT=1440m DURATION>30m
MAIL user-65aef167d5bd@xymon.invalid COLOR=yellow SERVICE=cpu,disk,memory RECOVERED
---
Also, I've noticed that when a fault occurs I get two emails (or SMSes)
and another when the fault is rectified. I'm thinking this is because
of the 'RECOVERED' line, but I thought this would only trigger when the
fault goes away. Have I misunderstood?
Thanks
CC
--
RHCE#805007969328369
list Vernon Everett
Hi Colin
One line per alert, with RECOVERED on the end.
Change it to something like this.
MAIL user-4c524593359c@xymon.invalid SERVICE=proliant FORMAT=SMS REPEAT=1440m RECOVERED
Cheers
Vernon
list Colin Coe
Cool. Thanks Vernon.
--
RHCE#805007969328369
list Henrik Størner
▸
In <AANLkTi=user-dc67722108c1@xymon.invalid> Colin Coe <user-5b250cd7a540@xymon.invalid> writes:
The alerting is starting to take shape but I've a question regarding how the alerting works. If I have a stanza similar to the following, how is it evaluated? Once for all hosts, or for one host at a time?
I understand your curiosity, but does it really matter how? In any case, it is evaluated whenever a potential alert may be generated, based on the host/service combination, time of day and all the other criteria. Think of it as a set of rules: each time there is something red or yellow, hobbitd_alert looks at this set of rules and finds those actions that match (if any).
▸
HOST=%.*
# Proliant tests
MAIL user-4c524593359c@xymon.invalid SERVICE=proliant FORMAT=SMS REPEAT=1440m
MAIL user-4c524593359c@xymon.invalid SERVICE=proliant FORMAT=SMS RECOVERED
Also, I've noticed that when a fault occurs I get two emails (or sms') and another when the fault is rectified. I'm thinking this is because of the 'RECOVERED' line but i thought this would only trigger when the fault goes. Have I misunderstood?
I think you have. Your configuration sets up two alerting actions, but
both of them send mail to the same recipient. That's why you get two
messages. What you want to do is simpler:
HOST=%.*
# Proliant tests
MAIL user-4c524593359c@xymon.invalid SERVICE=proliant FORMAT=SMS REPEAT=1440m RECOVERED
This will give you one message when the service goes red or yellow, and
one when it recovers. "RECOVERED" is an "add-on" to the normal alert,
since you probably would like to know not only when something is fixed,
but also when it broke.
Regards,
Henrik
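Carrying the same advice through the rest of Colin's stanza might look like the sketch below. It assumes that each pair of lines differing only in REPEAT versus RECOVERED can be merged in the same way. The yellow rules are left out because their two lines also differ in DURATION, so merging them would change which recoveries are reported.

```
HOST=%.*
# Proliant tests
MAIL user-4c524593359c@xymon.invalid SERVICE=proliant FORMAT=SMS REPEAT=1440m RECOVERED
# conn where status is RED
MAIL user-4c524593359c@xymon.invalid COLOR=red SERVICE=conn EXPAGE=dev REPEAT=1440m RECOVERED
# conn where status is RED (dev/test)
MAIL user-65aef167d5bd@xymon.invalid COLOR=red SERVICE=conn PAGE=dev REPEAT=1440m RECOVERED
# cpu,disk,memory where status is RED
MAIL user-4c524593359c@xymon.invalid COLOR=red SERVICE=cpu,disk,memory EXPAGE=dev REPEAT=1440m RECOVERED
# Dev servers
MAIL user-65aef167d5bd@xymon.invalid COLOR=red SERVICE=cpu,disk,memory PAGE=dev REPEAT=1440m RECOVERED
```

With one rule per recipient and condition, each fault should produce a single alert plus a single recovery message.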