Xymon Mailing List Archive

questions

5 messages in this thread

list Chad Rodriguez · Wed, 21 Jun 2017 21:07:14 +0000 ·
We have no monitoring in place other than SolarWinds, which monitors heartbeats. Essentially, we have a few application servers that randomly stop communicating with the director, and we're having to reboot them. I'm wondering whether your application could alert on situations like this via email notification.

Respectfully,

Chad Rodriguez | Systems Administrator
XXXXX N. XXth Ave., Phoenix, AZ XXXXX
office: XXX-XXX-XXXX | fax: XXX-XXX-XXXX
email: user-d4c92497b647@xymon.invalid
Upcoming Out-of-Office dates:
June 26th through July 4th
July 21st
list Jeremy Laidman · Thu, 22 Jun 2017 20:22:58 +1000 ·
Chad

Situations like what, exactly? When a server is rebooted? Or when a server
stops communicating? Can you explain the symptoms? What is a "director"?
Sorry, I'm not familiar with the SolarWinds product.

Out of the box, Xymon can detect a few different types of communication
issues (eg ping checks, TCP port responses) as well as monitoring logfiles
for messages that indicate trouble. Furthermore, Xymon is highly
extensible, so if you can write a script to perform a test for your
problem, you can turn it into a message for Xymon to display, and
optionally alarm via email or other means.
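
To illustrate the script-based extension described above, here is a minimal Python sketch (not from the original thread). It assumes the usual Xymon status message layout of `status HOST.TEST COLOR text`, with dots in the hostname written as commas, and a Xymon server listening on TCP port 1984. The hostname, test name, and message text are hypothetical.

```python
import socket

# Colors a custom test would normally report; treat this set as an assumption.
VALID_COLORS = {"green", "yellow", "red", "clear"}


def build_status(host, test, color, text):
    """Build a Xymon-style status message for one host and one test column."""
    if color not in VALID_COLORS:
        raise ValueError(f"unsupported color: {color}")
    # Dots in the hostname are written as commas so that the final dot
    # separates the hostname from the test name.
    machine = host.replace(".", ",")
    return f"status {machine}.{test} {color} {text}\n"


def send_to_xymon(message, server, port=1984, timeout=10.0):
    """Send one message to a Xymon server (server name supplied by the caller)."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(message.encode("utf-8"))
        conn.shutdown(socket.SHUT_WR)  # signal end of message


if __name__ == "__main__":
    # Hypothetical host and test names, for illustration only.
    print(build_status("app1.example.com", "director", "red", "no heartbeat"), end="")
```

In a real deployment the script would decide the color from its own check and then call `send_to_xymon` (or the `xymon` command-line client) with the resulting message; email alerting is then handled by the server's alert rules.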

Cheers
Jeremy
list Chad Rodriguez · Thu, 22 Jun 2017 12:59:35 +0000 ·
Symptom: we open up the director and see application servers not communicating, while at the same time we can ping the servers by hostname and IP.
list Jeremy Laidman · Fri, 23 Jun 2017 01:02:10 +1000 ·
Chad

What does the director do? How does it communicate with the servers?

Does the director or the server create a log message when there's a
problem? Xymon can detect and alarm on that.
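
As a rough illustration of the log-based idea (not from the original thread): Xymon's built-in log monitoring is driven by configuration rather than code, so the Python sketch below only shows the underlying logic of mapping log lines to a status color. The patterns and log lines are hypothetical.

```python
import re


def classify_log(lines, red_patterns, yellow_patterns=()):
    """Return (color, matching_lines) for a batch of log lines.

    A line matching any red pattern makes the result red; otherwise a line
    matching a yellow pattern makes it yellow; with no matches it is green.
    """
    red = [re.compile(p) for p in red_patterns]
    yellow = [re.compile(p) for p in yellow_patterns]
    color = "green"
    hits = []
    for line in lines:
        if any(r.search(line) for r in red):
            color = "red"
            hits.append(line)
        elif any(y.search(line) for y in yellow):
            if color == "green":
                color = "yellow"
            hits.append(line)
    return color, hits


if __name__ == "__main__":
    sample = [
        "10:00 heartbeat ok",
        "10:05 WARN slow response from director",
        "10:06 ERROR lost connection to director",
    ]
    print(classify_log(sample, [r"ERROR"], [r"WARN"]))
```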

Does the director connect to the servers on a specific TCP port? If that
port is rejecting a connection, the Xymon server can test for that (every 5
minutes, but can be more often) and alarm on it.
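
The kind of TCP port test described above can be sketched in a few lines of Python (not from the original thread, and not Xymon's own implementation). It simply attempts a connection and maps the outcome to a color; the host and port passed in are whatever the director actually uses, which the thread does not state.

```python
import socket


def port_color(host, port, timeout=5.0):
    """Return "green" if a TCP connection succeeds, "red" otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "green"
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return "red"


if __name__ == "__main__":
    # Hypothetical target, for illustration only.
    print(port_color("127.0.0.1", 8080))
```

A check like this detects a port that refuses or ignores connections even when the host still answers ping, which matches the symptom reported earlier in the thread.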

When a server fails, does it stop listening on a particular TCP port? Or
perhaps a process crashes and restarts, causing the connection to fail?
Xymon can test for these and alarm when it detects a missing listening or
established TCP socket, or a missing process.

It's also possible to write a script to have Xymon look at the process
listing "ps" output, and look for a particular process's lifetime, and
alert when it's less than 5 minutes.
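
A minimal Python sketch of the process-lifetime check (not from the original thread). It assumes a Linux `ps` that supports the `etimes` column (elapsed seconds since the process started), and the process name `appagent` is hypothetical.

```python
import subprocess


def young_processes(ps_output, name, min_age_seconds=300):
    """Return (pid, age_seconds) for processes called `name` that are
    younger than `min_age_seconds`, given `ps -eo pid,etimes,comm` output."""
    found = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        pid, etimes, comm = fields
        if comm.strip() == name and int(etimes) < min_age_seconds:
            found.append((int(pid), int(etimes)))
    return found


def read_ps():
    """Run ps and return its output (assumes the etimes column is available)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etimes,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    restarted = young_processes(read_ps(), "appagent")
    print("red" if restarted else "green", restarted)
```

A non-empty result suggests the process restarted within the last five minutes, which a wrapper script could report as a red status.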

One thing to note is that Xymon's probes and processes typically look for
things every 5 minutes. Transient failures that come and go within a few
seconds may not be detected using the standard probes and checks. The
frequency of some of these probes can be increased to make it more likely
to catch failures, and a custom script can be written to check the state as
often as you need. For transient faults, however, it's more reliable to
look for artefacts of a failure (log errors and warnings, short process
lifetime) rather than periodically checking for a successful state.

J
list Chad Rodriguez · Thu, 22 Jun 2017 15:12:34 +0000 ·
Disregard. I thought your tool was specific to XenApp/Citrix-based application monitoring. Sorry for the bother, I’ll look elsewhere for a solution.