nightly reboots
list Gavin Leonard
All,
I am having an issue where my Hobbit server thinks that every server it monitors has been rebooted, so I get blasted with SMS messages when this happens. None of the servers have actually rebooted, nor have there been any network outages. Ideas? Thoughts?
Gavin Leonard
Director, Systems-Network Engineering
T XXX-XXX-XXXX
F XXX-XXX-XXXX
E user-d65663809eb4@xymon.invalid
Research | Marketing | Sales Generation
www.progrexion.com
This email and its contents are confidential. If you are not the intended recipient, delete this email and do not use or disclose the information within this email or its attachments. Thank you.
list Josh Luthman
I have had the problem where the conn test goes bad for everything (not every host, just groups based on bb-hosts) since I installed it at the office. No idea why :(
What I do is delay the red SMS alerts by a few minutes, as the status is red for only a few seconds, sometimes a minute.
--
Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX
Those who don't understand UNIX are condemned to reinvent it, poorly.
--- Henry Spencer
list Gavin Leonard
OK, so how do you delay the red alerts? I am wondering if I am just overloading this system. I may need to build another BB server so I can split up the workload a bit. Thanks in advance!
-Gavin
list Josh Luthman
HOST=%.*\.imaginenetworksllc\.com
    MAIL user-2cb415b7efbb@xymon.invalid COLOR=RED DURATION>2m REPEAT=60 RECOVERED FORMAT=SMS

Like that
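The effect of the DURATION>2m criterion in the rule above can be illustrated with a small Python sketch. This is not Hobbit's alert code; the `Event` type, the `should_page` function, and the 120-second default are hypothetical names chosen for illustration. It only shows why a conn test that is red for a few seconds never produces a page, while a red status that persists past the delay does.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in for one status event seen by the alert module."""
    host: str
    color: str               # "red", "yellow", "green", ...
    seconds_in_state: float  # how long the status has had this color


def should_page(event: Event, min_duration_s: float = 120.0) -> bool:
    """Mimic COLOR=RED DURATION>2m: page only for red events older than the delay."""
    return event.color == "red" and event.seconds_in_state > min_duration_s


# A short flap stays quiet; a sustained outage pages.
flap = Event("www.example.invalid", "red", 45)
outage = Event("www.example.invalid", "red", 180)
print(should_page(flap), should_page(outage))
```

The trade-off is that a real outage is also reported two minutes late, so the delay should be no longer than the flaps it is meant to hide.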
list Ralph Mitchell
How much work is the server doing? The company that just laid me off has an old single-CPU, 733MHz DL380 running RedHat 7.2. It runs a lot of bash scripts out of cron to fetch and check web pages, with the results being reported back to the same machine. Last time I was able to see it, there were over 400 bb-hosts entries and over 2500 reports. It has a fairly constant load average of around 5 or 6, spiking to maybe 10 or 11 whenever the planets align and a lot of stuff happens simultaneously.

As soon as they can figure out how to replace it, Hobbit will be shut down, as it's not one of the officially blessed monitoring systems. However, even the folks in their Integration Labs admit they have nothing that can do quite what I've done with Hobbit, so I imagine they'll end up telling their customers the monitoring is being downgraded. I'd love to be a fly on the wall for *those* conversations... :)

Ralph Mitchell
list Michael Nemeth
Maybe his network is overloaded. We run some security scripts at night that hit the network so heavily that we get network problems.
list Gavin Leonard
Wow, well, not as much as that for sure. It's an Apple G5 server with a couple of gigs of RAM running Yellow Dog <-- thank heavens for Yellow Dog, or there would be nothing better to do with Mac servers than use them as doorstops. :)
-Gavin
list Gavin Leonard
Even stranger is that they hit right when the hobbit-alerts file starts sending pages for the day: 5am, bleh...
-Gavin
list Gavin Leonard
Update to this: I stopped getting pages for all my servers supposedly becoming unreachable; now I just get one that states that the bbtest had recovered. It looks like this. Does that shed any more light for those having this same problem?

green Tue Jan 27 07:31:00 2009
bbtest-net version 4.2.0
SSL library : OpenSSL 0.9.7f 22 Mar 2005
LDAP library: OpenLDAP 20223

Statistics:
 Hosts total           : 62
 Hosts with no tests   : 0
 Total test count      : 64
 Status messages       : 65
 Alert status msgs     : 0
 Transmissions         : 2

DNS statistics:
 # hostnames resolved  : 62
 # succesful           : 61
 # failed              : 1
 # calls to dnsresolve : 64

TCP test statistics:
 # TCP tests total     : 2
 # HTTP tests          : 1
 # Simple TCP tests    : 1
 # Connection attempts : 2
 # bytes written       : 133
 # bytes read          : 149174

TIME SPENT
 Event                                    Starttime          Duration
 bbtest-net startup                       1233066660.603990         -
 Service definitions loaded               1233066660.605545  0.001555
 Tests loaded                             1233066660.617314  0.011769
 DNS lookups completed                    1233066660.640820  0.023506
 Test engine setup completed              1233066660.642001  0.001181
 TCP tests completed                      1233066660.643672  0.001671
 PING test completed (62 hosts)           1233066663.593909  2.950237
 PING test results sent                   1233066663.594425  0.000516
 Test result collection completed         1233066663.594433  0.000008
 LDAP test engine setup completed         1233066663.594434  0.000001
 LDAP tests executed                      1233066663.594436  0.000002
 LDAP tests result collection completed   1233066663.594437  0.000001
 Test results transmitted                 1233066663.594806  0.000369
 bbtest-net completed                     1233066663.596373  0.001567
TIME TOTAL                                                   2.992383

-Gavin
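To see where a bbtest-net run spends its time, the TIME SPENT figures from the status message above can be ranked with a few lines of Python. This is only a sketch: it assumes the report is available as text with one stage per line, and `stage_durations` and `slowest` are hypothetical helper names, not part of Hobbit. The numbers are copied from the message.

```python
import re

# Timing lines copied from the bbtest status message above.
REPORT = """\
bbtest-net startup                       1233066660.603990         -
Service definitions loaded               1233066660.605545  0.001555
Tests loaded                             1233066660.617314  0.011769
DNS lookups completed                    1233066660.640820  0.023506
Test engine setup completed              1233066660.642001  0.001181
TCP tests completed                      1233066660.643672  0.001671
PING test completed (62 hosts)           1233066663.593909  2.950237
PING test results sent                   1233066663.594425  0.000516
Test result collection completed         1233066663.594433  0.000008
LDAP test engine setup completed         1233066663.594434  0.000001
LDAP tests executed                      1233066663.594436  0.000002
LDAP tests result collection completed   1233066663.594437  0.000001
Test results transmitted                 1233066663.594806  0.000369
bbtest-net completed                     1233066663.596373  0.001567
TIME TOTAL                                                  2.992383
"""

# Event name, then a start timestamp, then a numeric duration.
# The startup line (duration "-") and the TIME TOTAL line do not match.
STAGE = re.compile(r"^(?P<event>.+?)\s+(?P<start>\d+\.\d+)\s+(?P<duration>\d+\.\d+)$")


def stage_durations(report: str) -> list[tuple[str, float]]:
    """Return (event, duration in seconds) for every stage with a duration."""
    stages = []
    for line in report.splitlines():
        match = STAGE.match(line.strip())
        if match:
            stages.append((match.group("event"), float(match.group("duration"))))
    return stages


def slowest(report: str) -> tuple[str, float]:
    """Return the stage with the largest duration."""
    return max(stage_durations(report), key=lambda stage: stage[1])


if __name__ == "__main__":
    event, seconds = slowest(REPORT)
    print(f"slowest stage: {event} ({seconds:.6f}s)")
```

In this run the ping stage accounts for nearly all of the three-second total, and the whole pass finishes in about three seconds, so this particular report does not by itself suggest an overloaded network tester.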